[Debtags-devel] Using tags for aggregating popcon data

Peter Rockai (mornfall) mornfall@danill.sk
Mon, 6 Sep 2004 20:55:46 +0200


On Sunday 05 September 2004 23:40, Enrico Zini wrote:
> Hello,
Hello
>
> while messing around with autodebtag I started thinking that it could be
> used to infer tags from special data sources, for example popcon data.
>
> While the popcon database has numeric data, it can be aggregated in,
> say, 5 or 7 "frequency" steps corresponding to tags in a specific facet.
> For example:
>   popcon::unused	0-5
>   popcon::rare		5-20
>   popcon::seldom	20-100
>   popcon::common	100-500
>   popcon::everywhere	500+
I would probably prefer using the installs/popcon-users ratio, which would
make the values independent of the number of popcon users.
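
A rough sketch of the ratio idea in Python. The cut-off ratios and the
function name popcon_tag are made up for illustration and are not derived
from real popcon data; only the tag names come from the proposal above.

```python
from bisect import bisect_right

# Hypothetical cut-off ratios (installs / popcon users), chosen only
# to illustrate the idea; real values would need tuning on popcon data.
THRESHOLDS = [0.001, 0.01, 0.05, 0.25]
TAGS = [
    "popcon::unused",
    "popcon::rare",
    "popcon::seldom",
    "popcon::common",
    "popcon::everywhere",
]

def popcon_tag(installs, popcon_users):
    """Map an install count to a popcon:: tag via its share of all users."""
    if popcon_users <= 0:
        raise ValueError("popcon_users must be positive")
    ratio = installs / popcon_users
    # bisect_right counts how many thresholds are <= ratio,
    # which is the index of the matching tag.
    return TAGS[bisect_right(THRESHOLDS, ratio)]
```

With these assumed thresholds, 30 installs among 1000 users and 300
installs among 10000 users land in the same bucket, which is the point of
using a ratio.
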
>
> This opens an interesting door to easily integrate into existing package
> tools (well, where existing means "that use debtags") information of a
> completely new kind.
>
> Now, this wouldn't probably work for popcon (for example, I would seldom
> want to see just the packages in one category, and it's more likely that
> I want something like everything bigger or equal than "seldom").
Which leads us to extending the tag idea a bit further: a facet could be
typed, so that the tag itself could be an integer or an enumeration (and, of
course, a plain id as it is currently). Just a wild idea, not sure if it's of
any use ;).
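
To make the typed-facet idea a bit more concrete, here is a minimal Python
sketch, assuming an enumeration-typed popcon facet whose values are ordered.
The class Popcon, the helper at_least and the package data are all
hypothetical; nothing like this exists in debtags as far as this thread
says.

```python
from enum import IntEnum

class Popcon(IntEnum):
    """Hypothetical enumeration-typed facet; members are ordered."""
    UNUSED = 0
    RARE = 1
    SELDOM = 2
    COMMON = 3
    EVERYWHERE = 4

def at_least(packages, minimum):
    """Return the names of packages whose popcon value is >= minimum."""
    return sorted(name for name, value in packages.items() if value >= minimum)

# Made-up package data, for illustration only.
packages = {
    "foo": Popcon.RARE,
    "bar": Popcon.SELDOM,
    "baz": Popcon.EVERYWHERE,
}
```

Because the values are ordered, a query like "everything bigger or equal
than seldom" becomes at_least(packages, Popcon.SELDOM), instead of listing
each matching tag by hand.
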
>
> However, I wanted to spawn some thinking: do you see other useful
> applications of this idea?  How to get data, for example, from the BTS?
> Other possible data sources?
Hmm, dependency data? I am not sure whether I have already seen this
suggestion somewhere. For example, "Depends: kdelibs4" is a good indicator of
a package being KDE-based (OTOH, data inferred this way may need to be
moderated by a human, since mistakes/misinformation may be introduced this
way).
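
A minimal sketch of dependency-based inference in Python, assuming a simple
rule table. The tag name "suite::kde", the RULES table and both function
names are assumptions made for illustration. The parser is deliberately
naive: it treats alternatives ("a | b") as plain dependencies. The result is
only a list of suggestions, meant to be queued for human review rather than
applied directly.

```python
import re

# Hypothetical rules: dependency name -> tag it suggests.
# The tag name is illustrative, not checked against the debtags vocabulary.
RULES = {"kdelibs4": "suite::kde"}

def parse_depends(field):
    """Extract bare package names from a Depends field value."""
    names = []
    for clause in field.split(","):
        for alternative in clause.split("|"):
            # Drop version constraints such as "(>= 4:3.2.3)".
            name = re.sub(r"\(.*?\)", "", alternative).strip()
            if name:
                names.append(name)
    return names

def suggest_tags(depends_field, rules=RULES):
    """Return tags suggested by the dependencies, pending moderation."""
    names = parse_depends(depends_field)
    return sorted({rules[name] for name in names if name in rules})
```

For instance, suggest_tags("kdelibs4 (>= 4:3.2.3), libc6") would propose the
assumed KDE tag, and a moderator would then accept or reject it.
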
>
> Ciao,
> Enrico
Yours,
    Peter
>
> --
> GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico@debian.org>