[Teammetrics-discuss] Name fixing (Was: NNTPStat completed successfully.)
    Andreas Tille 
    andreas at an3as.eu
       
    Wed Nov  9 16:47:25 UTC 2011
    
    
  
On Wed, Nov 09, 2011 at 05:24:28PM +0530, Sukhbir Singh wrote:
> Interesting find. To find other names, I did this:
> 
>     select count(*) from listarchives where name ilike '<%';
>      count
>     -------
>       4681
>     (1 row)
This was quite probably after having found the example with my name.
 
>     select count(*) from listarchives where name ilike '''%';
>      count
>     -------
>         77
>     (1 row)
Same here.
 
> So clearly, there are other such names. When checking the message from
> lists.d.o web interface, I see that these are the messages that have a
> 'From'  header like:
> 
>     From: <foo at bar.com>
> 
> instead of
> 
>     From: Foo Bar <foo at bar.com>
> 
> ... the name is missing. So there is nothing we can do in this case
> because constructing a name from an email address is not possible.
> 
> However, we can strip the `<` and `'` characters from the 'Name' field
> which will make:  'Andreas Tille' and Andreas Tille equal.
Yes, this seems like a very reasonable approach.  I guess we could
probably do the same for '"' characters (untested - but I bet you will
find those as well).
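
A minimal sketch of what that stripping could look like.  The helper
name clean_name is hypothetical (it is not a function in liststat), and
two things here are assumptions: the characters are removed only at the
start and end of the name, so that a name like O'Brien survives, and
'>' is stripped along with '<'.

```python
# Characters to drop from both ends of a name.  '<' and "'" come from
# the discussion above, '"' is the suggested addition, and '>' plus
# whitespace are assumptions made for this sketch.
STRIP_CHARS = " \t\r\n<>'\""


def clean_name(raw_name):
    """Return raw_name without stray quoting characters at either end."""
    return raw_name.strip(STRIP_CHARS)


if __name__ == "__main__":
    for sample in ("'Andreas Tille'", '"Andreas Tille"', "Andreas Tille"):
        print(repr(clean_name(sample)))
```

With this, the quoted and unquoted variants of a name compare equal,
while quotes inside the name are left alone.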
 
> For the cases of: tillea at rki.de ;  tille at debian.org, again the same
> problem is there.
> 
> So, to summarize:
> 
> 1. I will strip the `<` and `'` characters.
Yes + '"' (if found).
> 2. For cases where the name == the email address, there is nothing we
> can do except add entries manually.
Yes.  We put name = email into the database and fix the names we are
interested in in the updatenames.py script (as we do anyway - there is
no better option).
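
For illustration, the name = email fallback could be sketched with the
standard library's email.utils.parseaddr.  The function name
name_and_email is hypothetical, and I am assuming the raw 'From' header
value is available as a string; this is not how liststat currently
does it.

```python
from email.utils import parseaddr


def name_and_email(from_header):
    """Split a 'From' header value into (name, email).

    When the header carries no display name, as in '<foo@bar.com>',
    fall back to using the email address as the name.
    """
    name, addr = parseaddr(from_header)
    if not name:
        name = addr
    return name, addr


if __name__ == "__main__":
    print(name_and_email("Foo Bar <foo@bar.com>"))
    print(name_and_email("<foo@bar.com>"))
```

Entries that end up with name == email can then be picked out and
mapped to real names by hand.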
 
> We have another bug in our code, something that I feel stupid about!
> 
> I just noticed when investigating the above problem, we are storing
> the email address in the form of:
> 
>     name at domain.com
> 
> Stupid me! I don't know what came into my mind when I wrote this line in liststat:
> 
>     email_addr = email_raw.replace('@', ' at ')
> 
> :(
> 
> I will fix all these issues and then push them.
Well, things like this just happen and we are currently not using
email anyway - so no real harm is done. :-)
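
If the rows already stored should be repaired rather than re-fetched,
something along these lines might do.  The helper restore_email is
hypothetical, and the rule of undoing the substitution only when there
is exactly one ' at ' and no '@' is my assumption, made to avoid
touching values that do not look like the output of that replace call.

```python
def restore_email(stored):
    """Undo the '@' -> ' at ' substitution on a stored address.

    Values that already contain '@', or that do not contain exactly
    one ' at ', are returned unchanged.
    """
    if "@" not in stored and stored.count(" at ") == 1:
        return stored.replace(" at ", "@")
    return stored


if __name__ == "__main__":
    print(restore_email("name at domain.com"))
    print(restore_email("name@domain.com"))
```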
 
Kind regards
       Andreas. 
-- 
http://fam-tille.de
    
    