[Debtags-devel] A first use of the bayesian tagger
Enrico Zini
enrico at enricozini.org
Tue Oct 25 20:42:06 UTC 2005
Hello,
I needed to do a very tedious task: verifying the tag patch from the
packagebrowser for inclusion in the Packages file. It's a 1 megabyte tag
patch. AArgh.
I needed help. But it's so tedious that I can't ask anyone without
feeling guilty.
Who's good at tedious tasks? Computers are. So I need an artificial
intelligence, and we happen to have one.
Step one: make it work.
apt-get install libparse-debian-packages-perl dh-make-perl
dh-make-perl --cpan Heap::Priority --desc "Heap::Perl needed for Ben's AI tagger"
dpkg -i libheap-priority-perl_0.01-1_all.deb
Step two: training.
./create-data.pl --max-good=100 --bad-ratio=2 filetransfer::ftp
./bayesian-tagger.pl --train filetransfer::ftp enrico
Step three: giving it a try. Someone added filetransfer::ftp to
apt-howto-ca, which I don't agree with, and someone added it to aria,
which I do agree with. What does ai-tagger think?
./bayesian-tagger.pl -p apt-howto-ca filetransfer::ftp
Package apt-howto-ca was categorized as unsure with a posterior to be good of 0.871862649929841
./bayesian-tagger.pl -p aria filetransfer::ftp
Package aria was categorized as good with a posterior to be good of 0.958824575032506
DUDE! You rock!
Let's see what happens if I keep for manual verification only those
packages for which the ai-tagger is unsure...
<coding... coding... coding... clickety clickety click...>
This will need a bit of scripting; I'll follow up with the results.
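A rough sketch of the filtering part, assuming the tagger keeps printing
one "Package ... was categorized as ..." line per package, as in the
output above. The function name filter_unsure and the file packages.txt
are made up for illustration; this is not the script I'll end up with.

```shell
# Read bayesian-tagger.pl output on stdin and print only the names of
# the packages it reported as "unsure". Assumes the output format shown
# earlier: "Package NAME was categorized as VERDICT with a posterior ..."
filter_unsure() {
    sed -n 's/^Package \(.*\) was categorized as unsure .*$/\1/p'
}

# Hypothetical usage, with packages.txt holding one package name per line:
#   while read -r pkg; do
#       ./bayesian-tagger.pl -p "$pkg" filetransfer::ftp
#   done < packages.txt | filter_unsure
```

Fed the two result lines above, this would keep apt-howto-ca for manual
review and drop aria.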
Ciao,
Enrico
--
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico at enricozini.org>