Questions regarding "Smart Search" and Tagging via the Webinterface

Enrico Zini enrico at enricozini.org
Wed Apr 8 11:12:06 UTC 2009


On Mon, Apr 06, 2009 at 06:32:04PM +0200, Benjamin Mesing wrote:

> in preparation of a presentation I will be holding about Debtags I would
> like to ask some (further) questions:
>  1. How does the smart search operate? One first provides a seed for
>     a full-text search. Then the tag (sometimes multiple tags are
>     selected here) occurring most often is chosen as wanted tag.
>     That's the initial tag-set that is searched for. Entering a new
>     search pattern produces a new set of "Available" tags. Does this
>     search only the results of the previous search?

First, the keywords you enter are used to run a full-text search on
*packages*: the tags that you see are the tags of the top packages
resulting from the full-text search.  This allows us to perform what
looks like a full-text search on tags by exploiting the fact that
the package database has way more text data to run a full-text search on
than just the tag database.

In 'debtags smartsearch', every time you type keywords you're just
generating a new set of tags to choose from.  The search results depend
on what tags you have chosen.
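
To make the two steps concrete, here is a minimal Python sketch.  It is
not the debtags code and does not use Xapian: the package data is a toy
example, the "full-text search" is a naive word match, and the names
PACKAGES, fulltext_search, available_tags and packages_with_tags are
made up for illustration.

```python
from collections import Counter

# Hypothetical toy data: package name -> (description, set of tags).
PACKAGES = {
    "gimp": ("image manipulation program for photo editing",
             {"use::editing", "works-with::image"}),
    "inkscape": ("vector image drawing program",
                 {"use::editing", "works-with::image:vector"}),
    "mutt": ("text based mail reader",
             {"use::browsing", "works-with::mail"}),
    "imagemagick": ("image manipulation programs for the command line",
                    {"use::converting", "works-with::image"}),
}


def fulltext_search(keywords, limit=10):
    """Rank packages by how many query words occur in their description."""
    words = keywords.lower().split()
    scored = []
    for name, (description, _tags) in PACKAGES.items():
        desc_words = set(description.lower().split())
        score = sum(1 for w in words if w in desc_words)
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties are broken by name for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _score, name in scored[:limit]]


def available_tags(keywords, limit=10):
    """Tags of the top packages for the keywords, most frequent first."""
    counts = Counter()
    for name in fulltext_search(keywords, limit):
        counts.update(PACKAGES[name][1])
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _count in ranked]


def packages_with_tags(chosen):
    """The actual result list: packages carrying every chosen tag."""
    return sorted(name for name, (_d, tags) in PACKAGES.items()
                  if set(chosen) <= tags)
```

Typing new keywords only calls available_tags() again; the result list
comes from packages_with_tags() on whatever tags have been chosen so
far.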

The same mechanism is used in the tag editor, when you pick the
Available tags / Search function.  It can also be used to generate a
context-sensitive tag cloud during a package search: you can see that
implemented at http://debtags.debian.net/dde/q/axi/cquery except that
currently it does not work and I cannot fix it because it seems to be a
problem in UDD.

>  2. For the tag-editor (web), how are the suggested tags computed?
>     By AI-methods?

Same as the smart search.  Specifically, it uses Xapian: first it does a
full-text search on the packages, then it asks Xapian which tags are the
most significant for the top resulting packages.  Xapian is extremely
fast at computing that.
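
As a rough illustration of what "most significant tags" can mean, the
Python sketch below ranks tags by how over-represented they are among
the top results compared to the whole archive.  This is an assumption
made for the example: Xapian computes its own term weights, and the
formula and the name significant_tags here are not taken from it.

```python
from collections import Counter


def significant_tags(top_packages, all_packages, limit=5):
    """Rank tags by how over-represented they are in the top results.

    Both arguments map package name -> set of tags; top_packages is
    assumed to be a subset of all_packages.
    """
    top_counts = Counter(t for tags in top_packages.values() for t in tags)
    all_counts = Counter(t for tags in all_packages.values() for t in tags)
    scores = {}
    for tag, in_top in top_counts.items():
        share_top = in_top / len(top_packages)           # share among results
        share_all = all_counts[tag] / len(all_packages)  # share in the archive
        scores[tag] = share_top / share_all
    # Highest ratio first; ties are broken by tag name.
    return sorted(scores, key=lambda t: (-scores[t], t))[:limit]


# Hypothetical data: every package has "common", only the results have "x".
ALL = {"a": {"x", "common"}, "b": {"x", "common"},
       "c": {"common"}, "d": {"common", "y"}}
TOP = {"a": ALL["a"], "b": ALL["b"]}
ranked = significant_tags(TOP, ALL)
```

A tag that almost every package carries scores low even if all the top
results have it, which is why plain counting alone would not be enough
to call a tag significant.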

>  3. On 26th September Enrico spoke about the presented usage of
>     debtags for CDD-package selection at Firenze World Vision event.
>     On 17th May 2005 Holger Levsen proposed the usage of tags for
>     CDDs. Are you aware of any CDDs that are actually using Tags for
>     their purpose, especially for handling the set of packages in
>     the CDD? 

No, I'm not.  As far as I know, mostly metapackages are used.  Years ago
we played with turning metapackage dependencies into tags, but not the
other way round.  It wouldn't be hard to do, but so far there hasn't
really been a point to it: dependencies are more expressive, as in a
metapackage you could distinguish depends, recommends and suggests,
which you cannot do with tags.

>  4. On 2nd March 2006 Enrico mentioned a discussion he had about
>     debtags replacing tasksel. Is this currently pursued in any way?

Not as far as I know.  And it does not seem to be terribly important,
nor useful: hand-crafted tasks are probably of better quality than
things autogenerated with tags, because while tagging one is not
necessarily thinking of tasks as the main use case.

Tags can of course be extremely useful to look for packages to add to
hand-crafted tasks.


Ciao,

Enrico

-- 
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico at debian.org>