[Debtags-devel] AI Tagger

Enrico Zini enrico at enricozini.org
Fri Aug 19 20:19:26 UTC 2005

On Tue, Aug 16, 2005 at 09:09:43PM +0200, Benjamin Mesing wrote:

> > About giving it a try, I'd like to.  What I'd like to do is to run it on
> > the commandline, get a tag patch out, review it in the new debtags-edit
> > patch reviewer[1] and commit what's good of it.
> Yeah, this sounds like a good integration. Do you have a specification
> for the patch system at hand? Or should I use the Perl interface of
> libdebtags (which will probably mean more effort, but perhaps I should
> switch to use apt-front anyways...)?

The patches I use have a really easy format:

package1: +tag1, +tag2, -tag3, -tag4
package2: +tag1, +tag2, -tag3, -tag4
package3: +tag1, +tag2, -tag3, -tag4

It's the same format you get in ~/.debtags/patch after using debtags-edit.
You can also generate it with 'tagcoll diff', apply it with
  tagcoll --patch-with=patchfile copy tagdbfile
and submit it to the central repository with 'debtags submit patchfile'.
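
To make the format concrete, here is a minimal sketch in Python of reading
such a patch and applying it to an in-memory tag set. It is not how tagcoll
or debtags implement it: the function names parse_patch and apply_patch, the
dict-of-sets representation and the sample tags are all assumptions made for
illustration, and it assumes that '+tag' adds a tag and '-tag' removes one.

```python
def parse_patch(text):
    """Parse 'package: +tag1, -tag2' lines into {package: (added, removed)}.

    Hypothetical helper, not part of tagcoll or debtags.
    """
    patch = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first ':' only, since tags themselves contain '::'.
        package, _, changes = line.partition(":")
        added, removed = set(), set()
        for item in changes.split(","):
            item = item.strip()
            if item.startswith("+"):
                added.add(item[1:])
            elif item.startswith("-"):
                removed.add(item[1:])
        patch[package.strip()] = (added, removed)
    return patch


def apply_patch(tagdb, patch):
    """Return a copy of tagdb ({package: set of tags}) with the patch applied."""
    result = {pkg: set(tags) for pkg, tags in tagdb.items()}
    for pkg, (added, removed) in patch.items():
        tags = result.setdefault(pkg, set())
        tags |= added    # assumed meaning of '+tag'
        tags -= removed  # assumed meaning of '-tag'
    return result


# Illustrative data only; the tags are examples, not real tagger output.
sample_patch = "kwrite: +use::editing, -use::viewing\n"
sample_db = {"kwrite": {"use::viewing", "interface::x11"}}
patched = apply_patch(sample_db, parse_patch(sample_patch))
print(sorted(patched["kwrite"]))
```

Something along these lines should be enough to turn tagger output into a
patch file, or to sanity-check a patch before reviewing it in debtags-edit.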

> > Can you show me the two example commandlines?  I can take care of
> > generating the tag patch from the output.
> Ok, here is what you can do right now:
>      1. ./create-data.pl use::editing --max-good=100 --bad-ratio=2
>      2. ./bayesian-tagger.pl use::editing
>      3. ./bayesian-tagger.pl --test-package kwrite
> The first step will create training and test data.
> The second will train and test with the created data, and print some
> statistics.
> The third will test the package kwrite.

Good.  This is holy documentation to me.

I'm not able to look into it right now, as I'm in the middle of a big
refactoring of libtagcoll1: I discovered that in C++ you really don't
want to overload virtual functions, and I've started a ground-up cleanup
to get rid of that, hitting some other nails while I'm at it.

But I'll get back to the bayesian tagging, because I was having some fun
with the autotaggers before getting caught up in the coding, and I'd like
to go back to that.



GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico at enricozini.org>