[Debtags-devel] Tags and facets

Enrico Zini enrico@enricozini.org
Mon, 7 Mar 2005 16:43:43 +0100


Hello,

Cool discussion!

Which made me think: so far, the vocabulary has been created by looking
at the packages we have and trying to infer qualities from them.  This
was good for getting an understanding of the problem, but it has shown
shortcomings: the body of packages we have is too chaotic, and a
vocabulary built this way will always be loosely defined and only
moderately useful to people.

These days, however, one of my (concurrent!) jobs is to teach basic
design to some high school students, and I've been talking a bit about
users' goals.

So the two things came together: a way to make the vocabulary both well
defined and useful is to treat the facet as something identifying "a
kind of usage or need", and the tags as "what one looks for in that
context".

The implemented-in and langdevel facets are perfect in this respect.
The security facet can easily be made so as well.  Each of them
identifies a very clear domain:

  implemented-in: "I want to find examples of software implemented in a
                   given language"
  langdevel:      "I want to implement in a given language: what can help
                   me?"
  security:       "I want to make my computer more secure: what can help
                   me?"

Other facets are more problematic because there is no such clear
question behind them.

So this would shift the focus from understanding the packages to
understanding needs.  That sounds much more promising, and has the
potential to yield a more useful categorization.


Other people, however, have wondered about other uses of tagging:
tagging how a package is maintained ("orphaned-upstream",
"has-critical-bugs", ...), for example.  Facets in this case are
essential to keep such utility tags under a specific namespace.
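
To make the namespace idea concrete, here is a minimal sketch, not
actual debtags code.  It assumes tags are written as "facet::tag"; the
package names, the "maint" facet and the "security::firewall" tag are
made up for illustration only.

```python
from collections import defaultdict

# Hypothetical sample data: package names, the "maint" facet and the
# individual tags are illustrative, not taken from the real vocabulary.
PACKAGE_TAGS = {
    "pkg-a": {"implemented-in::python", "security::firewall"},
    "pkg-b": {"implemented-in::c", "maint::orphaned-upstream"},
    "pkg-c": {"implemented-in::python", "maint::has-critical-bugs"},
}


def split_tag(full_tag):
    """Split 'facet::tag' into (facet, tag); reject unfaceted tags."""
    facet, sep, tag = full_tag.partition("::")
    if not sep or not facet or not tag:
        raise ValueError(f"not a faceted tag: {full_tag!r}")
    return facet, tag


def by_facet(package_tags):
    """Build an index of the form facet -> tag -> set of package names."""
    index = defaultdict(lambda: defaultdict(set))
    for package, tags in package_tags.items():
        for full_tag in tags:
            facet, tag = split_tag(full_tag)
            index[facet][tag].add(package)
    return index


index = by_facet(PACKAGE_TAGS)

# "I want to find examples of software implemented in a given language"
python_packages = sorted(index["implemented-in"]["python"])

# Maintenance tags stay inside their own facet and do not mix with
# the tags of the other facets.
maintenance_tags = sorted(index["maint"])
```

Each facet then answers one question, and a group that maintains its own
facet only ever touches the tags under its own prefix.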


Given all these interpretations of tagging, the most interesting option
is to leave it open for various groups to create facets and share them.


The way tags are edited could change as well: tags in facets such as
implemented-in and langdevel could be filled in by ordinary developers;
tags in facets such as 'security' or 'field' could instead be filled in
by experts, who know the field and the software used in it.


Debtags can already download vocabularies and data from different
sources, but Erich's website and debtags-edit only support editing one
specific data source: our central tag vocabulary.


That's one thing to work on.  The other is to design an interface which
can present the user with a reduced number of facets.

There are these, and other ideas as well, such as timestamping the tags,
using RDF and the semantic web, and Bayesian tagging over package
metadata and documentation.

We are building quite a lot of theory here, but it's mainly in our
heads: when and how are we going to start writing it down?


Ciao,

Enrico

--
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico@enricozini.org>
