[Teammetrics-discuss] Phase I: Statistics for mailing lists on Alioth

Andreas Tille andreas at an3as.eu
Mon May 30 06:28:45 UTC 2011


On Fri May 27 Sukhbir Singh wrote:
> 
> As we had decided initially, we will be parsing the mbox archives that
> are generated by Pipermail, the message archiver that comes with
> Mailman (the lists on Alioth run on Mailman). There is some sample
> code [0] that parses an mbox archive and outputs the statistical data
> we need. You need to pass an mbox archive to the script, but otherwise
> this code is good to go as far as our requirements are concerned.

Fine.  We just discussed that the next step would be to download the
mboxes of interest automagically and put the data into a database.
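
To make the "parse, then store" step concrete, here is a minimal sketch in
Python. It is not the sample code [0]; it assumes that the statistic of
interest is the number of posts per sender and month, and it uses SQLite
only as a stand-in database. The names count_posts, store_counts and the
post_counts table are hypothetical. Public Pipermail archives usually write
addresses as "user at host", so the sketch undoes that before parsing.

```python
import mailbox
import sqlite3
from collections import Counter
from email.utils import parseaddr, parsedate_to_datetime


def sender_address(msg):
    """Return the lower-cased sender address of a message."""
    raw = str(msg.get("From", ""))
    if "@" not in raw:
        # Assumption: the archive obfuscates addresses as "user at host".
        raw = raw.replace(" at ", "@", 1)
    return parseaddr(raw)[1].lower()


def count_posts(mbox_path):
    """Count messages per (YYYY-MM, sender) in one mbox file."""
    counts = Counter()
    box = mailbox.mbox(mbox_path)
    try:
        for msg in box:
            try:
                when = parsedate_to_datetime(str(msg.get("Date", "")))
            except (TypeError, ValueError):
                continue  # skip messages without a usable Date header
            counts[(when.strftime("%Y-%m"), sender_address(msg))] += 1
    finally:
        box.close()
    return counts


def store_counts(counts, list_name, db_path):
    """Write the counts for one list into a SQLite table (hypothetical schema)."""
    con = sqlite3.connect(db_path)
    with con:  # commits on success, rolls back on error
        con.execute(
            "CREATE TABLE IF NOT EXISTS post_counts ("
            "list_name TEXT, month TEXT, sender TEXT, n INTEGER, "
            "PRIMARY KEY (list_name, month, sender))"
        )
        con.executemany(
            "INSERT OR REPLACE INTO post_counts VALUES (?, ?, ?, ?)",
            [(list_name, month, sender, n)
             for (month, sender), n in counts.items()],
        )
    con.close()
```

Because of the primary key and INSERT OR REPLACE, running the import twice
for the same month overwrites the old rows instead of duplicating them.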
 
> For our specific case, we will have to download the mbox archive (for
> each list) locally and then parse it. As an example, consider the
> blends-commit mailing list [1]. The archives for the blends-commit
> mailing list run from February 2009 to May 2011. For each mailing list,
> I will parse the HTML and download all the gzip files which of course
> will correspond to the months the list was active. That way we will
> parse and download only what is required.
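
The link-collection step described above could look roughly like the
following Python sketch. It assumes that the Pipermail index page links to
the monthly archives with hrefs ending in ".txt.gz" (for example
"2011-May.txt.gz"); the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class GzipLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points to a .txt.gz file."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.endswith(".txt.gz"):
                self.links.append(href)


def archive_urls(index_html, base_url):
    """Return absolute URLs of all monthly archives linked from the index."""
    parser = GzipLinkParser()
    parser.feed(index_html)
    return [urljoin(base_url, href) for href in parser.links]


def fetch_archive_urls(base_url):
    """Download the index page of one list and extract its archive URLs."""
    with urlopen(base_url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return archive_urls(html, base_url)
```

Only months that actually appear on the index page produce a URL, so
nothing is requested for months in which the list was inactive.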

Please note that this mailing list has a specific history:  Due to a
rename from CDD to Blends the mboxes which can be found here

    http://lists.alioth.debian.org/pipermail/cdd-commits/

should be regarded as input from the same team.
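
One possible way to handle such renames is a small mapping from a team to
all list names it has used, so that the per-list numbers are summed per
team. This is only a sketch: the team key "blends" and the function name
are assumptions, and the per-list data is assumed to be a Counter keyed by
(month, sender).

```python
from collections import Counter

# Hypothetical configuration: every list name a team has posted under.
TEAM_LISTS = {
    "blends": ("blends-commit", "cdd-commits"),
}


def merge_team_counts(per_list_counts, team_lists=TEAM_LISTS):
    """Sum the per-list Counters of each team into one Counter per team."""
    merged = {}
    for team, lists in team_lists.items():
        total = Counter()
        for name in lists:
            # Lists without data simply contribute nothing.
            total.update(per_list_counts.get(name, {}))
        merged[team] = total
    return merged
```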

> I need your thoughts on this approach. Is this OK, or can we do
> better? Note that the aim is to automate the entire process. There is
> no information that we need to give the program; it does
> everything itself.

The plan sounds reasonable - nothing to add for the moment.
 
> I reinvented the wheel (thanks to Andreas for allowing me to do this!)

:-) No need to thank me for this.

Kind regards

        Andreas.

-- 
http://fam-tille.de
