Hello Alex,

Congratulations on the SoC project!

For the other people subscribed to the list, here's the Trac site for
the project:

Here are my notes about the classifier:

You want to work using extended package data, which is great!  If you
don't have a local mirror and need some data mining tool to run on one,
please let us know and we'll try to help.

I see the Bayesian classifier being useful not only for assigning tags,
but also for doing some limited QA on tags submitted by people using the
website or debtags-edit.  It might be an instance of the same problem
(if the predictor would also add the submitted tag, then we accept it
automatically), or there could be some extra interaction between the
two tasks (for example, submissions by trusted people could be used as
training data, if training data is needed).
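
To make the QA idea a bit more concrete, here is a rough sketch in
Python.  It is only an illustration under my own assumptions, not a
proposal for how the project should be implemented: the TagClassifier
class, the triage() function, the word lists and the threshold are all
made up for the example, and it uses a plain multinomial naive Bayes
with one yes/no decision per tag.

```python
import math
from collections import Counter, defaultdict


class TagClassifier:
    """Hypothetical multinomial naive Bayes, one yes/no decision per tag."""

    def __init__(self):
        self.n_docs = 0                        # packages seen in training
        self.all_words = Counter()             # word counts over all packages
        self.tag_docs = Counter()              # tag -> packages carrying it
        self.tag_words = defaultdict(Counter)  # tag -> word counts

    def train(self, words, tags):
        """Learn from one package: its description words and its tags."""
        self.n_docs += 1
        self.all_words.update(words)
        for tag in tags:
            self.tag_docs[tag] += 1
            self.tag_words[tag].update(words)

    def log_odds(self, words, tag):
        """Log odds that a package with these words carries the tag."""
        with_tag = self.tag_docs[tag]
        without_tag = self.n_docs - with_tag
        score = math.log((with_tag + 1) / (without_tag + 1))
        pos = self.tag_words[tag]
        pos_total = sum(pos.values())
        neg_total = sum(self.all_words.values()) - pos_total
        # +1 reserves a slot for unseen words and avoids dividing by zero.
        v = len(self.all_words) + 1
        for w in words:
            p_pos = (pos[w] + 1) / (pos_total + v)
            p_neg = (self.all_words[w] - pos[w] + 1) / (neg_total + v)
            score += math.log(p_pos / p_neg)
        return score

    def suggests(self, words, tag, threshold=0.0):
        """True if the classifier would add this tag on its own."""
        return self.log_odds(words, tag) > threshold


def triage(clf, words, submitted_tags):
    """Split a submission into auto-accepted tags and tags needing review."""
    accepted = [t for t in submitted_tags if clf.suggests(words, t)]
    review = [t for t in submitted_tags if t not in accepted]
    return accepted, review


# Toy training data, invented for the example.
clf = TagClassifier()
clf.train(["mail", "imap", "client"], {"works-with::mail"})
clf.train(["mail", "smtp", "server"], {"works-with::mail"})
clf.train(["image", "viewer", "png"], {"works-with::image"})
clf.train(["image", "editor", "jpeg"], {"works-with::image"})

accepted, review = triage(
    clf, ["imap", "mail", "reader"],
    ["works-with::mail", "works-with::image"])
```

Feeding submissions from trusted people back in would just be another
call to train() with the package's words and the submitted tags.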

A big bonus would be to have reusable code: the classifier could be used
in various places, such as: in the website or debtags-edit to suggest
tags to add; in a script run by cron, to validate tag submissions; when
receiving tag submissions, to separate trivial ones from the ones that
need manual review; and possibly more, depending on how the classifier
will actually work and in which ways it will be possible to abuse it
for other purposes :)

Notes aside, congratulations again on getting the project approved, and
thanks to Erich for writing the proposals and taking care of all the
SoC business.



