Further ideas for Debtags AI

Alex de Landgraaf alex at delandgraaf.com
Tue Jun 13 22:52:31 UTC 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Erich,

Erich Schubert wrote:
>> - User selects if the selected tags should be added to or removed from
>> the selected packages
> 
> I'd love to see the AI assist here. It would be cool if the user
> selects a group of packages and the tags-to-add list magically fills
> itself.

Neat idea! We would have to make it clear that these are merely
suggestions, though. Also, evaluating a package against every tag would
take some time. But it would be a great feature :)
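
A minimal sketch of how the tags-to-add list could be filled for a group
of selected packages. Everything here is an assumption for illustration:
score(package, tag) stands in for whatever the AI ends up providing, and
the 0.8 cutoff is just a placeholder. It also shows where the time goes,
since in the worst case score() is called once per package/tag pair.

```python
def suggest_tags(packages, tags, score, threshold=0.8):
    """Return tags that clear `threshold` for every selected package.

    `score` is a hypothetical callable (package, tag) -> float in [0, 1].
    The result is only a list of suggestions; the user still decides.
    """
    packages = list(packages)
    if not packages:
        # Nothing selected, so there is nothing to suggest.
        return []
    suggestions = []
    for tag in tags:
        # all() stops at the first package that falls below the cutoff,
        # so the worst case is len(packages) * len(tags) score() calls.
        if all(score(pkg, tag) >= threshold for pkg in packages):
            suggestions.append(tag)
    return suggestions


# Toy scores standing in for a real classifier (made-up numbers).
toy_scores = {
    ("vim", "use::editing"): 0.90,
    ("emacs", "use::editing"): 0.85,
    ("vim", "role::program"): 0.95,
    ("emacs", "role::program"): 0.50,
}
suggested = suggest_tags(
    ["vim", "emacs"],
    ["use::editing", "role::program"],
    lambda pkg, tag: toy_scores[(pkg, tag)],
)
```

Requiring every selected package to clear the cutoff is only one possible
rule; a majority vote or an average score would be just as easy to try.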

> Yes. But that might require some serious reworking of the database,
> too... :-(
> We really need the database and admin/review work. I had hoped we
> might get a second SoC project for these though. I'd really appreciate
> it, but it might be a bit outside of your main focus, the AI. However,
> some things might be needed by you anyway (e.g. some database) so it
> might work out just fine.

What would be required for the database rewrite? Any pointers? Although
the AI is indeed my main focus, the summer is long and I don't mind
helping out in other areas, especially if I'll end up doing it anyway in
order to help improve the reviewing.

> preselected to approve/reject? yeah, I think that's a good idea.
> 
> However, don't combine that too much with the above plan: if the same
> AI suggests tags and reviews them, that is actually more likely to be
> error-prone. These should be done with different algorithms and/or
> training data to avoid such effects.
> When stuff is preselected, the admin is less likely to thoroughly
> check them. A bad suggestion by the AI tagger with a high score might
> sneak in this way again and again.

Yup, overfitting would indeed be a problem.

As a start we could use different thresholds (0.8 for suggesting, 0.9
for pre-approving, for example). Once I get around to the other
algorithms we can do some further experimentation in this area. Maybe
the AI tagger could randomly take a subset (half or so) of the packages
to train on each week, instead of the whole deal.
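
A rough sketch of both ideas, under stated assumptions: the scores are
taken to be in [0, 1], the comparisons are inclusive (>=), and the
function names and the seed parameter are made up for illustration. Only
the 0.8/0.9 values and the "half or so" fraction come from the text above.

```python
import random

SUGGEST_THRESHOLD = 0.8     # example value from the message
PREAPPROVE_THRESHOLD = 0.9  # example value from the message


def classify_score(score):
    """Map a tagger score to an action (hypothetical action names)."""
    if score >= PREAPPROVE_THRESHOLD:
        return "pre-approve"
    if score >= SUGGEST_THRESHOLD:
        return "suggest"
    return "ignore"


def weekly_training_subset(packages, fraction=0.5, seed=None):
    """Pick a random subset of the packages to train on this week.

    `seed` is only there to make a run reproducible; leave it as None
    to get a different subset each time.
    """
    packages = list(packages)
    rng = random.Random(seed)
    count = int(len(packages) * fraction)
    # sample() draws without replacement, so no package appears twice.
    return rng.sample(packages, count)
```

For example, classify_score(0.85) gives "suggest", and
weekly_training_subset(all_packages) returns roughly half of
all_packages, a different half on each call unless a seed is passed.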

cheers,
Alex
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEj0GvQeuQA5TF/UsRAgtUAJ95qhDBeiJWWesbSSyXws3rJkylIgCfZHCg
IPOGMPZ+W7OQZN47yqwhv5s=
=6v8x
-----END PGP SIGNATURE-----




More information about the Debtags-devel mailing list