[Debtags-devel] AI-Tagging further steps

Thaddeus H. Black t@b-tk.org
Fri, 29 Oct 2004 13:00:57 +0000



Ben writes,

>         ~/lang/perl/autodebtag> ./bayesian-tagger.pl data__font
>         Tested packages: 30
>         Matches: 30 ^= 1
>         Mismatches: 0 ^= 0

This tends to prove the Bayesian concept, doesn't it?

> ...  Another thing to keep in mind
> is that false positives should be better than false negatives, as false
> positives will show irrelevant results to the user, but false negatives
> will hide relevant packages from the user.

A general debtags remark or two, not especially in response to Ben's
words:

The very nature of the debtags project is to resist automation in its
core activity: tagging packages.  Were it not so, Debian hackers would
surely have attacked the tagging problem years ago.  If coding is fun
but tagging is boring, then, well, ... I don't quite know what to say.
What do you say?  We cannot altogether code around this problem, I
think.  The code is very important.  It provides needed structure.  It
automates automatable tasks.  It helps a lot.  But the code is only part
of the answer.  The other part is as digging a mine in olden days.  We
take our shovels.  We heft our picks.  We dig through the hill, five
meters a day.

(It has been observed that hackers do not like to dig through the hill,
five meters a day.  We like giant diesel boring machines and dynamite,
don't we?  Maybe so.)

I tend to accept Ben's judgment in favor of false positives, although
one hopes that the false positives are meant to be shown to a human
editor (who can eliminate them), not directly to the end user.
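
The trade-off can be sketched in a few lines of Python.  This is only
an illustration under stated assumptions, not Ben's Perl tagger: the
scores, the two thresholds, and the names suggest_tags and count_errors
are all hypothetical.  A lower threshold exchanges false negatives for
false positives, and the suggestions go to an editor for review rather
than straight to the user.

```python
def suggest_tags(scores, threshold):
    """Return every tag whose (hypothetical) classifier score meets the threshold."""
    return {tag for tag, score in scores.items() if score >= threshold}


def count_errors(suggested, actual):
    """Count (false positives, false negatives) against the hand-assigned tags."""
    false_positives = len(suggested - actual)  # suggested, but wrong
    false_negatives = len(actual - suggested)  # correct, but not suggested
    return false_positives, false_negatives


# Made-up scores for a single package; "actual" is what a human tagger chose.
scores = {"font": 0.92, "x11": 0.41, "game": 0.35, "sound": 0.05}
actual = {"font", "x11"}

strict = suggest_tags(scores, threshold=0.5)   # misses "x11"
lenient = suggest_tags(scores, threshold=0.3)  # adds a spurious "game"

print(count_errors(strict, actual))   # (0, 1): one relevant tag hidden
print(count_errors(lenient, actual))  # (1, 0): one extra tag for the editor to strike
```

With the lenient threshold the editor has one wrong suggestion to
delete, whereas with the strict one a correct tag never reaches the
editor at all.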

-- 
Thad
