Announcing Debian Package Tags

Javier Fernández-Sanguino Peña jfs@computer.org
Mon, 28 Apr 2003 23:38:26 +0200


On Mon, Apr 28, 2003 at 09:38:31PM +0200, Enrico Zini wrote:
> On Mon, Apr 28, 2003 at 12:30:54PM -0400, Colin Walters wrote:
>
> > 2) Do you foresee tags being maintained outside of the packages in the
> > future?  For developing the tag system this makes sense, but it seems to
> > me that maintainers should have more direct control over this somehow.
>
> We've been thinking about having the maintainers responsible for the
> tagging of their packages, with the risk of ending up with untagged or
(..)
>
> This is another of the issues that need some more in-field experience to
> see what design decisions should be taken next.

Sorry to mention it again, but IMHO, if you want to be able to manage tags
while packages are constantly being added to the archive, you need an
automated mechanism to help you. That's where document clustering and
keyword abstraction, and bow (the 'bag of words' library), might help.

Package descriptions are good candidates for document clustering, in the
sense that you can determine whether packages are 'related' to others by
comparing their descriptions. You can even extract valuable keywords from
packages using techniques such as TF-IDF. This method is also (mostly)
language-agnostic, so it can be useful for DDTP descriptions as well.
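
To make the idea concrete, here is a minimal sketch in plain Python of
TF-IDF weighting plus cosine similarity over a handful of descriptions.
It does not use bow or its API; it is a stand-in written with the standard
library only. The sample descriptions and the function names (tokenize,
tfidf_vectors, cosine, top_keywords) are made up for illustration; real
input would presumably come from the archive's package descriptions.

```python
import math
import re
from collections import Counter

# Hypothetical sample data, not taken from the real archive.
DESCRIPTIONS = {
    "mutt": "text based mail reader with MIME and PGP support",
    "pine": "text based mail reader and news reader",
    "gimp": "image manipulation program for photo retouching",
    "imagemagick": "image manipulation programs for converting image formats",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def tfidf_vectors(docs):
    """Map each document name to a {word: tf-idf weight} dictionary."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    # Document frequency: in how many descriptions does each word appear?
    df = Counter()
    for words in tokens.values():
        df.update(set(words))
    n = len(docs)
    vectors = {}
    for name, words in tokens.items():
        tf = Counter(words)
        vectors[name] = {
            w: (count / len(words)) * math.log(n / df[w])
            for w, count in tf.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_keywords(vector, k=3):
    """Return the k highest-weighted words, ties broken alphabetically."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


if __name__ == "__main__":
    vectors = tfidf_vectors(DESCRIPTIONS)
    for name in DESCRIPTIONS:
        print(name, top_keywords(vectors[name]))
    print("mutt~pine:", round(cosine(vectors["mutt"], vectors["pine"]), 3))
    print("mutt~gimp:", round(cosine(vectors["mutt"], vectors["gimp"]), 3))
```

Words that appear in only one description get the highest weights, so they
surface as candidate keywords, while packages sharing vocabulary (the two
mail readers here) score a higher similarity than unrelated ones. A real
setup would likely also want stop-word removal and an actual clustering
step on top of the similarity scores.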

Regards

Javi
