[Debtags-devel] New web interface

Benjamin Mesing bensmail at gmx.net
Mon Nov 14 19:20:24 UTC 2005


Hello,

> > I think the idea is a good one - it might even work in practice :-).
> > I've tried a search starting with "package", "email" and "browser".
> > Package was a total failure (I wanted tools dealing with packages).
> 
> Try it now.  The search was case-sensitive, now I have made it
> insensitive.  This, and the preseeding of unwanted tags suggested by
> mornfall, gives nice fine results.
Searching for "package" still does not give me what I want ("package
management" does no better). See my "scoring" annotations below for a
proposal that might help.


> > Email required me to remove 5 of the 6 suggested tags (though the one
> > left was works-with::email which is good). Browser gave me some good
> > results (I only had to remove GTK-UI ;-).
> 
> [email] Yes, we have lots of email stuff like small libraries and tiny
> tools.  I've added made-of::* to the list of tags that don't concur in
> populating the initial tag selection: this should make things a bit
> better.
Well, the initial tag set is still not a good hit. But then, looking
through the packages returned by a search for "email" shows why the
problem is so complicated. For example, phpgroupware-* appears 43 times
in the search result and might skew the search results far in this
direction. However, scoring might help a little here too.

> [browser] I've added uitoolkit::* to the list of tags that don't concur
> in populating the initial tag selection as well.
I don't know what you are talking about :-) What do you mean by "don't
concur in populating the initial tag selection"? Do you leave those out
(i.e. never select them initially)?


> > However I think the results could be improved by performing some kind of
> > scoring on the search results received by the full text search. Perhaps
> > adding tags which match the search expression, regardless of whether
> > they appear often in the search results would be helpful too.
> 
> The idea is interesting, but I'm not sure how to implement it.  Suppose
> that someone looks for "debugger interface": then interface would match
> practically all of the uitoolkit tags, adding them all to the 'wanted'
> list, and giving an empty result (one package rarely has more than one
> uitoolkit tag).
You are right, I have no real solution for this.


> You can do it yourself even:
I'd like to try some more, but I am so short on time recently :-(


> > Additionally I think that it must be possible to add tags which are
> > not proposed at one location (without bloating the UI too much there
> > could be a combobox, with autocompletion).
> 
> This would be nice, although it's complex to implement in a web
> interface.  It would actually be interesting to have such a widget ready
> (possibly with autocompletion working on the facet short description +
>  tag short description rather than the name, since the name should
>  eventually be hidden in favour of the short description).
I am not a real web hacker - because this is a real pain-in-the-ass :-)
In my opinion, the most important facets for the average user are
"use", "role" and "interface". In particular, always showing
"role::application" on the right side might help a lot of users make
the search result more meaningful. This is because I think the main
use case for searching is: "I want a program with which I can do XYZ".



> > I think this does also provide a really good approach for a real search
> > application (without the limitations of a web interface). Let's see if I
> > can use some of those ideas :-))
> 
> Do it!  Do it!
Again I'd love to, but I have very little time :-( But I'll take a look
once I can spare some hours.


The Scoring Proposal:
Currently packagesearch scores the apt results based on the search
expression entered. Try it out to see how this scoring performs. The
scoring algorithm is quite simple (please ask for more details).
A similar scoring could be applied to the initial package set. Then,
instead of simply counting the occurrences of the tags, each occurrence
could be weighted by the package score, and the resulting values
accumulated per tag. This would give supposedly better-matching
packages a higher influence on the tags initially chosen.
(Example: a search for <package management> gives 376 packages, with
kpackage, equivs and synaptic being among the top 10, though gnome-apt
and checkinstall are much further down.)
You could also weight by the expected relevance of the tag (though we
would need a manual scoring of tag importance for this).
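
To make the weighting idea concrete, here is a rough Python sketch. It
is only an illustration, not how packagesearch or the web interface
actually works: the function names, the package scores and the tag sets
in the sample data are all made up, and I assume each search hit is
available as a (score, tags) pair.

```python
from collections import defaultdict

def weighted_tag_scores(results):
    """Accumulate, per tag, the scores of all packages carrying that tag.

    `results` is an iterable of (package_score, tags) pairs. This layout
    is an assumption made for the sketch.
    """
    totals = defaultdict(float)
    for score, tags in results:
        for tag in tags:
            totals[tag] += score
    return dict(totals)

def top_tags(results, n=3):
    """Return the n tags with the highest accumulated score."""
    totals = weighted_tag_scores(results)
    # Sort by descending total, then by tag name to break ties.
    return sorted(totals, key=lambda tag: (-totals[tag], tag))[:n]

# Hypothetical search hits: two well-matching packages and three weak
# matches that happen to share a tag.
results = [
    (0.9, {"use::package-management", "interface::x11"}),
    (0.8, {"use::package-management", "interface::commandline"}),
    (0.1, {"made-of::html", "interface::x11"}),
    (0.1, {"made-of::html"}),
    (0.1, {"made-of::html"}),
]

totals = weighted_tag_scores(results)
best = top_tags(results, n=1)
```

With plain counting, the tag shared by the three weak matches would win
(3 occurrences against 2). With the weighting, the tag carried by the
two well-scored packages comes out on top.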

Some other random thoughts (not specifically for Enrico's web interface,
only to write down what comes to my mind in case someone might find it
useful):
Define some facets where most often only one tag is assigned (e.g.
interface::, uitoolkit::, made-of::). For those, choose at most one of
the tags (having interface::x11 && interface::commandline gives a small
result set of packages).
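
A minimal Python sketch of the "at most one tag per facet" rule,
assuming the candidate tags already come with some weight (a count or
an accumulated score). The function name, the weights and the sample
tags are hypothetical; only the three facet names are taken from the
paragraph above.

```python
# Facets where a package usually carries only one tag.
SINGLE_VALUED_FACETS = {"interface", "uitoolkit", "made-of"}

def limit_single_valued(tag_weights, single_valued=SINGLE_VALUED_FACETS):
    """Keep all tags, except that each single-valued facet contributes
    only its highest-weighted tag.

    `tag_weights` maps a "facet::tag" string to a numeric weight.
    """
    best = {}   # facet -> best tag seen so far for that facet
    kept = []   # tags from facets without the restriction
    for tag, weight in tag_weights.items():
        facet = tag.split("::", 1)[0]
        if facet not in single_valued:
            kept.append(tag)
        elif facet not in best or weight > tag_weights[best[facet]]:
            best[facet] = tag
    return sorted(kept + list(best.values()))

# Hypothetical candidate tags with made-up weights.
candidates = {
    "interface::x11": 5,
    "interface::commandline": 3,
    "use::browsing": 4,
    "role::application": 2,
}
selected = limit_single_valued(candidates)
```

Here only interface::x11 survives from the interface facet, so the
initial selection never demands two interface tags at once.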

Allow a two-field full-text search, where in one field a main search
expression is given, and in the other a number of additional expressions
can be entered.
E.g. [email] [client]
E.g. [browser] []
E.g. [debugger] [interface]
But this might lead to a workflow of: choose a use::* tag first and
continue afterwards.
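
How the two fields should be combined is left open above. One possible
reading, sketched in Python purely as an assumption: the main expression
decides which packages are returned at all, and the additional
expressions only influence the ordering. The function name and the
package data are invented for the example.

```python
def two_field_search(packages, main, extra=()):
    """Return names of packages matching `main`, best matches first.

    `packages` maps a package name to its description. A package must
    contain the main expression; each additional expression it also
    contains moves it further up in the result.
    """
    main = main.lower()
    extra = [term.lower() for term in extra]
    hits = []
    for name, description in packages.items():
        text = f"{name} {description}".lower()
        if main not in text:
            continue  # the main expression is mandatory
        bonus = sum(1 for term in extra if term in text)
        hits.append((bonus, name))
    # Highest bonus first, then alphabetical for a stable order.
    return [name for bonus, name in sorted(hits, key=lambda h: (-h[0], h[1]))]

# Invented packages, only for illustration.
packages = {
    "mailclient": "An email client with a GTK interface",
    "maillib": "Library for parsing email headers",
    "webthing": "A web browser",
}
email_hits = two_field_search(packages, "email", ["client"])
browser_hits = two_field_search(packages, "browser")
```

With [email] [client], both mail packages are returned but the client
is listed first; with [browser] [], the second field is simply empty.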

Best regards 

Ben






