[Teammetrics-discuss] Changes to liststat.py over the last days.

Sukhbir Singh sukhbir.in at gmail.com
Thu Sep 8 18:37:50 UTC 2011


Hi,

Some notable changes to liststat.py:

+ Better handling of encoding errors. We now fall back to chardet to
detect the encoding when decode_header cannot determine it. This
catches cases we previously failed to decode (false negatives). For
the encoding errors that remain, either the encoding cannot be
detected automatically or the message is spam (usually the latter).

+ We were skipping multipart messages earlier (e.g. PGP-signed
messages), but now we include them and take care of their encoding
as well (double win!).
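
To make the two changes above concrete, here is a minimal sketch in
Python 3. It is not the actual liststat.py code. The helper names
(to_unicode, header_text, body_text) are made up for illustration,
and treating chardet as an optional import is my assumption; the only
library calls used are decode_header, Message.walk and chardet.detect.

```python
from email.header import decode_header

try:
    import chardet  # third-party detector; optional here (assumption)
except ImportError:
    chardet = None


def to_unicode(raw, charset=None):
    """Decode bytes with the declared charset, else fall back to a guess."""
    if isinstance(raw, str):
        return raw  # decode_header returns str for plain ASCII headers
    if charset:
        try:
            return raw.decode(charset)
        except (LookupError, UnicodeDecodeError):
            pass  # declared charset is unknown or wrong; try detection
    guess = chardet.detect(raw)["encoding"] if chardet else None
    try:
        return raw.decode(guess or "utf-8")
    except (LookupError, UnicodeDecodeError):
        # Nothing worked (often spam): keep going with replacement chars.
        return raw.decode("utf-8", errors="replace")


def header_text(value):
    """Decode an RFC 2047 header such as Subject or From into one string."""
    return "".join(to_unicode(part, cs) for part, cs in decode_header(value))


def body_text(msg):
    """Collect the text/plain parts of a message, multipart or not."""
    chunks = []
    for part in msg.walk():  # walk() also yields the multipart containers
        if part.get_content_type() != "text/plain":
            continue  # skips containers, PGP signatures, attachments
        payload = part.get_payload(decode=True)  # undoes base64 / QP
        if payload:
            chunks.append(to_unicode(payload, part.get_content_charset()))
    return "\n".join(chunks)
```

With a message parsed by email.message_from_bytes, you would call
header_text(msg["Subject"]) and body_text(msg). A PGP-signed mail is
multipart/signed, so walk() yields its text part and the signature
part, and only the text part is kept.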

This new code needs more rigorous testing. When you are back and have
time, please do a `git pull`, run liststat.py against some non-English
lists on blends.d.n, and let it run to completion. Then we can see how
well the new detection works.

-- 
Sukhbir


