[Teammetrics-discuss] Name fixing (Was: NNTPStat completed successfully.)

Andreas Tille andreas at an3as.eu
Wed Nov 9 16:47:25 UTC 2011


On Wed, Nov 09, 2011 at 05:24:28PM +0530, Sukhbir Singh wrote:
> Interesting find. To find other names, I did this:
> 
>     select count(*) from listarchives where name ilike '<%';
>      count
>     -------
>       4681
>     (1 row)

This was quite probably after having found the example with my name.
 
>     select count(*) from listarchives where name ilike '''%';
>      count
>     -------
>         77
>     (1 row)

Same here.
 
> So clearly, there are other such names. When checking the message from
> lists.d.o web interface, I see that these are the messages that have a
> 'From'  header like:
> 
>     From: <foo at bar.com>
> 
> instead of
> 
>     From: Foo Bar <foo at bar.com>
> 
> ... the name is missing. So there is nothing we can do in this case
> because constructing a name from an email address is not possible.
> 
> However, we can strip the `<` and `'` characters from the 'Name' field
> which will make:  'Andreas Tille' and Andreas Tille equal.

Yes, this seems like a very reasonable approach.  I guess we could
probably do the same for '"' characters (untested - but I bet you will
find those as well).
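
A rough sketch of what such a cleanup could look like, purely as an
illustration: clean_name is a hypothetical helper (not something in
liststat), and also stripping '>' is my own assumption. It only strips
at the ends of the string, so an apostrophe inside a name survives.

```python
def clean_name(raw_name):
    """Strip stray quoting characters wrapped around a 'Name' field."""
    # Remove surrounding whitespace first, then any leading/trailing
    # <, >, ' or " characters, then whitespace left behind by that.
    return raw_name.strip().strip("<>'\"").strip()


if __name__ == "__main__":
    for sample in ("'Andreas Tille'", '"Andreas Tille"', "Andreas Tille"):
        print(clean_name(sample))
```

With this, 'Andreas Tille', "Andreas Tille" and Andreas Tille all map
to the same string.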
 
> For the cases of: tillea at rki.de ;  tille at debian.org, again the same
> problem is there.
> 
> So, to summarize:
> 
> 1. I will strip the `<` and `'` characters.

Yes + '"' (if found).

> 2. For cases where the name == the email address, there is nothing we
> can do except add entries manually.

Yes.  We put name = email into the database and fix the names we are
interested in in the updatenames.py script (as we do anyway - there is
no better way to do it).
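
The name = email fallback could be sketched roughly like this.  It
assumes the header is split with parseaddr from the standard library,
which may not be how liststat actually does it; name_and_email is a
hypothetical helper used only for illustration.

```python
from email.utils import parseaddr


def name_and_email(from_header):
    """Split a From header; fall back to the address when no name is given."""
    name, addr = parseaddr(from_header)
    if not name:
        # No display name in the header: store the address as the name
        # so it can be fixed up by hand later.
        name = addr
    return name, addr


if __name__ == "__main__":
    print(name_and_email("Foo Bar <foo@bar.com>"))
    print(name_and_email("<foo@bar.com>"))
```

Headers without a display name then end up with name == email, which
is the case that needs a manual entry.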
 
> We have another bug in our code, something that I feel stupid about!
> 
> I just noticed when investigating the above problem, we are storing
> the email address in the form of:
> 
>     name at domain.com
> 
> Stupid me! I don't know what came into my mind when I put this line in liststat:
> 
>     email_addr = email_raw.replace('@', ' at ')
> 
> :(
> 
> I will fix all these issues and then push them.

Well, things like this just happen and we are currently not using
email anyway - so no real harm is done. :-)
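
In case the rows already stored need repairing rather than
re-importing, something along these lines might do.  This is only a
sketch: restore_email is a hypothetical helper, and it assumes the
first ' at ' in a stored value is the one produced by the replace()
call quoted above.

```python
def restore_email(stored):
    """Undo the ' at ' obfuscation for a single stored address."""
    local, sep, domain = stored.partition(" at ")
    # Values that were never obfuscated contain no ' at ' and are
    # returned unchanged.
    return local + "@" + domain if sep else stored


if __name__ == "__main__":
    print(restore_email("name at domain.com"))
```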
 
Kind regards

       Andreas. 

-- 
http://fam-tille.de


