Further ideas for Debtags AI

Tue Jun 13 22:11:27 UTC 2006

Hello Alex,
> starting with the 3rd week of June it'll finally be summer and I'll make
Fine with me. I'll be busy as well.

> BTW, you said in one of your emails that you wanted some UI mockups. Do
> you have any preference on having this web-based (ajax \o/) or GTK-based
> (debtags-edit-like)?

I think Ajax would be nicer, but I consider Ajax to be dirty... GTK is
fine with me, so I leave that up to you. But for the intended users,
web would be better. Doesn't need to be very ajaxish, though.

> Was thinking along the lines of a web-based interface frontend:
> - User selects one or more packages (with auto-completion)
> - User selects one or more tags (ditto.)
> - User selects if the selected tags should be added to or removed from
>   the selected packages

I'd love to see the AI assist here. It would be cool if the user
selects a group of packages and the tags-to-add list magically fills
itself.
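
A minimal sketch of how such a self-filling tags-to-add list could work,
assuming a hypothetical score(package, tag) function that returns a
confidence between 0 and 1 (no such function is specified in this thread).
A tag is suggested only if it scores well for every selected package; the
0.8 cutoff and the sample scores are illustrative values only.

```python
from typing import Callable, Iterable, List


def suggest_tags(
    packages: Iterable[str],
    candidate_tags: Iterable[str],
    score: Callable[[str, str], float],  # hypothetical AI scorer, an assumption
    threshold: float = 0.8,              # illustrative cutoff, an assumption
) -> List[str]:
    """Return the candidate tags that score >= threshold for every package."""
    selected = list(packages)
    if not selected:
        # Nothing selected, so there is nothing to suggest.
        return []
    return sorted(
        tag
        for tag in candidate_tags
        if all(score(pkg, tag) >= threshold for pkg in selected)
    )


# Toy scores standing in for a trained tagger (made-up numbers).
toy_scores = {
    ("vim", "use::editing"): 0.95,
    ("emacs", "use::editing"): 0.90,
    ("vim", "interface::x11"): 0.30,
    ("emacs", "interface::x11"): 0.85,
}


def toy_score(pkg: str, tag: str) -> float:
    return toy_scores.get((pkg, tag), 0.0)


suggested = suggest_tags(
    ["vim", "emacs"], ["use::editing", "interface::x11"], toy_score
)
```

Requiring a high score for every selected package keeps the list
conservative; averaging the scores over the group would be a looser
alternative.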

> - User either chooses to submit changes (mail to central DB) or evaluates
>   changes against AI tagger (each combination is evaluated, choice to
>   submit changes afterwards)

That's useful, too. However, the evaluation function is probably even
more valuable for reviewing tags than for doing the first
submission.

> Also wanted to get the approval-process at least partially online, as
> the current way to approve or deny submissions isn't very efficient

Yes. But that might require some serious reworking of the database, too... :-(

> - Tag Admin logs in
> - Tag Admin sees new submissions per tag, selects the tags he/she wants
>   evaluated
> - For each tag all package-tag combinations are evaluated via the AI
>   tagger. Combinations > 0.80 are automatically set to approve, < 0.2 are
>   automatically set to reject, rest are left for the Tag Admin to decide.

Preselected to approve/reject? Yeah, I think that's a good idea.
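
The thresholds quoted above could be sketched roughly like this. The
function names and the "undecided" label are made up for illustration;
only the 0.80 and 0.2 cutoffs come from the proposal. The result is a
preselection for the Tag Admin, not a final decision.

```python
from typing import Dict, Tuple

APPROVE_ABOVE = 0.80  # from the proposal: scores above this preselect "approve"
REJECT_BELOW = 0.20   # from the proposal: scores below this preselect "reject"


def triage(score: float) -> str:
    """Preselect a decision for one package-tag combination."""
    if score > APPROVE_ABOVE:
        return "approve"
    if score < REJECT_BELOW:
        return "reject"
    # Everything in between is left for the Tag Admin to decide.
    return "undecided"


def preselect(
    scored: Dict[Tuple[str, str], float]
) -> Dict[Tuple[str, str], str]:
    """Map each (package, tag) submission to its preselected decision."""
    return {pair: triage(score) for pair, score in scored.items()}


# Made-up scores for three submissions.
decisions = preselect({
    ("vim", "use::editing"): 0.93,
    ("vim", "game::arcade"): 0.04,
    ("vim", "interface::x11"): 0.55,
})
```

Note that the comparisons are strict, so a score of exactly 0.80 or 0.2
stays undecided.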

However, don't combine that too much with the plan above: if the same AI
suggests tags and reviews them, that is actually more likely to be
error-prone. These should be done with different algorithms and/or
training data to avoid such effects.
When stuff is preselected, the admin is less likely to thoroughly
check them. A bad suggestion by the AI tagger with a high score might
sneak in this way again and again.
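
One way to get different training data for the two roles, as a rough
sketch: split the already-tagged packages into two disjoint sets, one for
the suggester and one for the reviewer. The function name and the
hash-based split are assumptions for illustration; this covers only the
"different training data" option, not the choice of different algorithms.

```python
import hashlib
from typing import Dict, Set, Tuple

TagDB = Dict[str, Set[str]]  # package name -> set of tags


def split_training_data(tagged: TagDB) -> Tuple[TagDB, TagDB]:
    """Split tagged packages into disjoint suggester and reviewer sets."""
    suggester: TagDB = {}
    reviewer: TagDB = {}
    for name, tags in tagged.items():
        # Hash the package name so the assignment is stable across runs.
        digest = hashlib.sha256(name.encode("utf-8")).digest()
        target = suggester if digest[0] % 2 == 0 else reviewer
        target[name] = set(tags)
    return suggester, reviewer


# Made-up sample of an already-tagged package database.
sample = {
    "vim": {"use::editing", "interface::text-mode"},
    "emacs": {"use::editing", "interface::x11"},
    "gimp": {"interface::x11"},
    "mutt": {"interface::text-mode"},
}
suggester_set, reviewer_set = split_training_data(sample)
```

Because each package lands in exactly one of the two sets, a mistake
learned from one package's tags cannot reach both the suggesting and the
reviewing side through the training data.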

> Maybe I'm straying too far off? Thoughts?

We really need the database and admin/review work. I had hoped we
might get a second SoC project for these though. I'd really appreciate
it, but it might be a bit outside of your main focus, the AI. However,
some things might be needed by you anyway (e.g. some database) so it
might work out just fine.

best regards,
Erich Schubert
--
    erich@(mucl.de|debian.org)      --      GPG Key ID: 4B3A135C    (o_
  To understand recursion you first need to understand recursion.   //\
  Wo befreundete Wege zusammenlaufen, da sieht die ganze Welt für   V_/_
        eine Stunde wie eine Heimat aus. --- Herrmann Hesse