[Teammetrics-discuss] Name fixing (Was: NNTPStat completed successfully.)
Andreas Tille
andreas at an3as.eu
Wed Nov 9 16:47:25 UTC 2011
On Wed, Nov 09, 2011 at 05:24:28PM +0530, Sukhbir Singh wrote:
> Interesting find. To find other names, I did this:
>
> select count(*) from listarchives where name ilike '<%';
> count
> -------
> 4681
> (1 row)
This was quite probably after having found the example with my name.
> select count(*) from listarchives where name ilike '''%';
> count
> -------
> 77
> (1 row)
Same here.
> So clearly, there are other such names. When checking the message from
> lists.d.o web interface, I see that these are the messages that have a
> 'From' header like:
>
> From: <foo at bar.com>
>
> instead of
>
> From: Foo Bar <foo at bar.com>
>
> ... the name is missing. So there is nothing we can do in this case
> because constructing a name from an email address is not possible.
>
> However, we can strip the `<` and `'` characters from the 'Name' field
> which will make: 'Andreas Tille' and Andreas Tille equal.
Yes, this seems like a very reasonable approach. I guess we could
probably do the same for '"' characters (untested - but I bet you will
find those as well).
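A minimal sketch of the stripping step, assuming it happens on the raw
name string before it is stored. The function name clean_name is
hypothetical and not taken from liststat. Only leading and trailing
characters are removed, so an apostrophe inside a name is kept:

```python
# Characters discussed in this thread: '<', single quote, double quote.
STRAY_CHARS = "<'\""


def clean_name(raw_name):
    """Remove stray quote and '<' characters from both ends of a name."""
    # Trim whitespace first so that quotes next to spaces are reached,
    # then trim again in case the quotes enclosed padded text.
    return raw_name.strip().strip(STRAY_CHARS).strip()


print(clean_name("'Andreas Tille'"))
print(clean_name('"Andreas Tille"'))
```

With this, the quoted and unquoted variants of a name compare equal.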
> For the cases of: tillea at rki.de ; tille at debian.org, again the same
> problem is there.
>
> So, to summarize:
>
> 1. I will strip the `<` and `'` characters.
Yes + '"' (if found).
> 2. For cases where the name == the email address, there is nothing we
> can do except add entries manually.
Yes. We put name = email into the database and fix the names we are
interested in in the updatenames.py script (as we do anyway - there is
no better way to do it).
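As a rough illustration of the fallback, here is a sketch using
parseaddr from the standard library. The function name_and_email and
the KNOWN_NAMES table are hypothetical; the table only stands in for
the manual fixes and does not show how updatenames.py actually works:

```python
from email.utils import parseaddr

# Hypothetical manual overrides, keyed by email address.
KNOWN_NAMES = {
    "foo@bar.com": "Foo Bar",
}


def name_and_email(from_header, overrides=None):
    """Return (name, email) for a From header, falling back to name = email."""
    overrides = overrides or {}
    name, addr = parseaddr(from_header)
    if addr in overrides:
        # A manually maintained name wins over whatever the header says.
        return overrides[addr], addr
    if not name:
        # Header was of the form "<foo@bar.com>": no name to extract.
        name = addr
    return name, addr


print(name_and_email("<someone@example.org>"))
print(name_and_email("<foo@bar.com>", KNOWN_NAMES))
```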
> We have another bug in our code, something that I feel stupid about!
>
> I just noticed when investigating the above problem, we are storing
> the email address in the form of:
>
> name at domain.com
>
> Stupid me! I don't know what came into my mind that I had this line in liststat:
>
> email_addr = email_raw.replace('@', ' at ')
>
> :(
>
> I will fix all these issues and then push them.
Well, things like this just happen and we are currently not using
email anyway - so no real harm is done. :-)
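If the rows already stored ever need repairing, something along these
lines might do. This is only a sketch: restore_address is a
hypothetical helper, and it assumes every stored value went through
exactly the replace() call quoted above:

```python
def restore_address(stored):
    """Turn 'name at domain.com' back into 'name@domain.com'."""
    # Split on the last ' at ' only; an address has a single '@',
    # so the original replace() inserted at most one separator.
    local, sep, domain = stored.rpartition(" at ")
    if not sep:
        # Nothing to repair (already contains '@' or is malformed).
        return stored
    return local + "@" + domain


print(restore_address("name at domain.com"))
```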
Kind regards
Andreas.
--
http://fam-tille.de