Go Alex Go!

Erich Schubert erich.schubert at gmail.com
Tue Jul 11 22:32:29 UTC 2006


Hi Alex,
When browsing the "mail" part of your TagDB web interface, I noticed
that using itemset mining could actually be quite useful, especially
since it's somewhat orthogonal to naive Bayes: naive Bayes completely
ignores "correlation" between tags, while itemset mining would use
only the correlation between tags.
In the "unapproved" changes, mail::mail-transfer-agent was replaced by
mail::mail-user-agent for squirrelmail, which obviously is good.
However, these changes are "correlated". There probably is not a
single package which has both the mua and mta tags. This information
could be useful for the AI tagger, too.
I'm just not sure yet whether we need "itemset mining", or whether we
are better off with some "traditional" correlation calculations, i.e.
for any two tags a, b calculate the conditional probabilities P(a | b)
and P(b | a) and somehow use these in the AI.
Yeah, we're probably better off with the latter... itemset mining would
be more useful for finding important tag groups, maybe redundancies,
implications etc. that extend to more than two tags.
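
To make the pairwise idea concrete, here is a rough sketch of how
P(a | b) could be estimated from plain co-occurrence counts. The
function name, the package names and the toy tag data are made up for
illustration (they are not taken from TagDB), and how the resulting
numbers would be fed into the AI tagger is left open.

```python
from collections import Counter
from itertools import permutations


def conditional_tag_probabilities(packages):
    """Estimate P(a | b) for every ordered pair of distinct tags.

    `packages` maps a package name to an iterable of tags.
    Returns a dict {(a, b): P(a | b)}, where P(a | b) is the number of
    packages carrying both a and b divided by the number carrying b.
    """
    tag_count = Counter()   # packages per tag
    pair_count = Counter()  # packages per ordered tag pair (a, b)
    for tags in packages.values():
        tags = set(tags)
        tag_count.update(tags)
        pair_count.update(permutations(sorted(tags), 2))

    all_tags = sorted(tag_count)
    return {
        (a, b): pair_count[(a, b)] / tag_count[b]
        for a in all_tags
        for b in all_tags
        if a != b
    }


# Toy data, purely hypothetical (assumption: not real TagDB content).
packages = {
    "pkg-a": {"mail::mail-user-agent", "interface::web"},
    "pkg-b": {"mail::mail-user-agent"},
    "pkg-c": {"mail::mail-transfer-agent"},
}

probs = conditional_tag_probabilities(packages)
# In this toy data no package has both the mua and the mta tag,
# so the estimate for P(mua | mta) is zero.
p_mua_given_mta = probs[("mail::mail-user-agent", "mail::mail-transfer-agent")]
```

A value near zero in both directions, as for the mua/mta pair here,
would be a hint that the two tags exclude each other. With few
packages per tag the raw ratios are noisy, so some smoothing would
probably be needed before using them.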

best regards,
Erich Schubert
--
    erich@(mucl.de|debian.org)      --      GPG Key ID: 4B3A135C    (o_
  To understand recursion you first need to understand recursion.   //\
  Wo befreundete Wege zusammenlaufen, da sieht die ganze Welt für   V_/_
        eine Stunde wie eine Heimat aus. --- Herrmann Hesse


