[Debtags-devel] Debtags weekly news

Enrico Zini enrico@enricozini.org
Mon, 18 Oct 2004 22:23:29 +0200


Hello!

Debtags weekly news
-------------------

I just implemented a 'stats' function in autodebtag, which finally
produces some interesting stats about the debtags effort.

Here are the results as of today.  If it wasn't clear enough from the
layout of this mail, I'm considering making a weekly or bi-weekly
newsletter with the results of the stats, plus some news about Debtags
development.

I'm really happy to launch this test newsletter by announcing that
bayesian-tagger.pl and create-training-set.pl by Benjamin Mesing have
landed in the autodebtag part of the Subversion repository, opening the
way for the new frontier of Bayesian automated tagging!

I'm also happy to announce the launch of libdebtags-perl, available in
subversion as:
	svn+ssh://alioth.debian.org/svn/debtags/libdebtags-perl
I had to implement a Set class in Perl to handle tag sets.  Python
already has a Set class, so it should be very easy to implement a
python-debtags as well: volunteers are welcome!
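
To illustrate why a set type makes this easy, here is a minimal sketch
of tag-set handling in Python.  It is not code from libdebtags-perl or
from any existing python-debtags: the parse_tags helper and the
comma-separated input format are assumptions made for the example, and
it uses Python's built-in frozenset.

```python
def parse_tags(text):
    """Split a comma-separated tag list into a frozenset of tag names.

    Hypothetical helper: the input format is an assumption for this sketch.
    """
    return frozenset(t.strip() for t in text.split(",") if t.strip())


a = parse_tags("devel::library, langdevel::perl")
b = parse_tags("devel::library, langdevel::python")

common = a & b   # tags present in both sets
either = a | b   # tags present in at least one set
only_a = a - b   # tags present in a but not in b

print(sorted(common))
print(sorted(either))
print(sorted(only_a))
```

Because frozensets are hashable, a whole tag set can also be used as a
dictionary key, which is handy when counting identical tag sets.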

Lastly, I implemented "autodebtag debramcheck", which compares how
debram and debtags group packages and shows where the two differ.  The
result seems to be very useful for spotting areas in which the
categorization needs more help.
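
One possible way to make such a comparison is sketched below in Python.
This is not the debramcheck implementation, whose method is not
described here: the Jaccard score, the compare_groupings function and
all group, tag and package names are assumptions chosen only to show
the idea of matching each group to its closest tag and listing the
packages on which they disagree.

```python
def jaccard(a, b):
    """Overlap between two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def compare_groupings(groups, tags):
    """For each group, find the most similar tag and the packages that differ.

    groups and tags both map a name to a set of package names.
    """
    report = {}
    for gname, gpkgs in groups.items():
        best = max(tags, key=lambda t: jaccard(gpkgs, tags[t]))
        report[gname] = {
            "tag": best,
            "score": jaccard(gpkgs, tags[best]),
            "only_in_group": gpkgs - tags[best],
            "only_in_tag": tags[best] - gpkgs,
        }
    return report


# Hypothetical data, for illustration only.
groups = {"editors": {"pkg-a", "pkg-b", "pkg-c"}}
tags = {
    "use::editing": {"pkg-a", "pkg-b", "pkg-d"},
    "role::doc": {"pkg-e"},
}

result = compare_groupings(groups, tags)
print(result["editors"])
```

A low score, or long "only_in_group" and "only_in_tag" lists, would
point at an area where the two categorizations disagree.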

Please give some feedback on this newsletter: do we want it?  In
debtags-announce?  In debian-devel-announce?  Weekly or bi-weekly?
Weekly in debtags-devel, bi-weekly in debtags-announce, monthly in
debian-devel-announce?  Other ideas for content?  Other ideas for stats?
Other ideas?


Debtags report
--------------

Date: Mon Oct 18 21:56:22 2004

15686 packages, 4977 not yet tagged, 0 completely tagged

Top ten facets by cardinality:
   10249: special
    4280: devel
    3925: <legacy>
    2775: role
    2495: uitoolkit
    2091: use
    1980: langdevel
    1626: suite
    1492: media
     961: interface

Top ten tags by cardinality:
    4977: special::not-yet-tagged
    3286: devel::library
    1231: uitoolkit::gtk
    1226: special::not-yet-tagged::l
    1077: role::utility
     740: role::doc
     694: langdevel::perl
     661: netcomm
     661: net
     649: uitoolkit::qt

Top ten tag sets by cardinality:
    1168: devel::library
    1131: special::not-yet-tagged, special::not-yet-tagged::l
     538: special::not-yet-tagged, special::not-yet-tagged::p
     291: special::not-yet-tagged, special::not-yet-tagged::k
     276: special::not-yet-tagged, special::not-yet-tagged::s
     263: devel::library, langdevel::perl
     240: special::not-yet-tagged, special::not-yet-tagged::m
     203: role::utility
     197: special::not-yet-tagged, special::not-yet-tagged::g
     196: devel::library, langdevel::python

3 random untagged packages: expect5.31, libinti1.0-1.2, pure-ftpd-common
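
For readers curious how a report like this can be computed, here is a
minimal sketch in Python.  It is not the autodebtag 'stats' code.  It
makes two assumptions: that a facet's cardinality counts tag
occurrences rather than packages, and that tags without a "::"
separator are counted under "<legacy>".  The function names and the
sample packages are hypothetical.

```python
from collections import Counter


def facet_of(tag):
    """Return the part before '::', or '<legacy>' for tags without a facet."""
    return tag.split("::", 1)[0] if "::" in tag else "<legacy>"


def stats(db, top=10):
    """Count facets, tags and whole tag sets over a {package: tags} mapping."""
    facets, tags, tagsets = Counter(), Counter(), Counter()
    for tagset in db.values():
        tagsets[frozenset(tagset)] += 1
        for tag in tagset:
            tags[tag] += 1
            facets[facet_of(tag)] += 1
    return (facets.most_common(top),
            tags.most_common(top),
            tagsets.most_common(top))


# Hypothetical sample database, for illustration only.
db = {
    "pkg-a": {"devel::library"},
    "pkg-b": {"devel::library", "langdevel::perl"},
    "pkg-c": {"devel::library", "langdevel::perl"},
    "pkg-d": {"net"},
}

top_facets, top_tags, top_tagsets = stats(db)

print("Top facets by cardinality:")
for name, count in top_facets:
    print("%8d: %s" % (count, name))

print("Top tag sets by cardinality:")
for tagset, count in top_tagsets:
    print("%8d: %s" % (count, ", ".join(sorted(tagset))))
```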


Ciao,

Enrico

--
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico@debian.org>