[Debtags-devel] spellchecking debian-packages
Mon, 6 Jun 2005 16:29:23 +0200
On Mon, Jun 06, 2005 at 11:24:47AM +0100, Justin B Rye wrote:
> Hi; I'm interested in helping. I am not a programmer (indeed, I'm
> Google's top hit for "IANAP"), but I've been a Debian user and
> sysadmin for many years, and a quick look at the "debian-packages"
> file shows one obvious thing I can do - spelling patch attached.
Wow, thanks! I applied both patches.
I had a look at your page (mentioning IANAP and Google hits was a
deliberate strategy to lure us there: admit it ;) and wow! Librarian,
linguist... Finally someone who may actually know something about
categorization has landed here! :)
Have you had a look at http://debtags.alioth.debian.org/paper-debtags.html ?
> I'm also strongly tempted to provide a patch imposing a standard
> capitalisation policy, and there are some other corrections I'd
> argue for (eg: there's no such thing as "X-Windows"), but I'm
> starting with something simple.
That's fine with me. I don't think the vocabulary entries have had
much proofreading at all, mainly because we are not sure about many of
them: if you look at the "Status:" of the facets, 3 of them are marked
'complete' (although among them, 'culture' is quite debatable); 9
are 'needing-review'; 13 are 'draft' and 4 are 'controversial'.
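For anyone who wants to redo that count, here is a rough sketch. It
assumes the vocabulary is a text file made of blank-line-separated
paragraphs, where a facet paragraph has a "Facet:" field and a
"Status:" field; the function name facet_status_counts is made up for
this example.

```python
from collections import Counter


def facet_status_counts(vocabulary_text):
    """Count facets per "Status:" value in a vocabulary-like text.

    Assumed layout: paragraphs separated by blank lines, one
    "Field: value" pair per line, continuation lines indented.
    """
    counts = Counter()
    for paragraph in vocabulary_text.split("\n\n"):
        fields = {}
        for line in paragraph.splitlines():
            # Skip indented continuation lines and lines without a field name.
            if ":" in line and not line.startswith((" ", "\t")):
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        # Only facet paragraphs are counted; tag paragraphs are ignored.
        if "Facet" in fields:
            counts[fields.get("Status", "unknown")] += 1
    return counts
```

Calling it on the contents of the vocabulary file should give back a
mapping from each status to the number of facets carrying it.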
We are facing questions such as:
- the 'use::' facet is damn useful, but how do we define it really?
What should go in it and what should not?
- how do we categorise technologies? For now I have split them into
different facets (format, protocol, dbtech, hwtech, filetransfer), but
that's questionable (isn't filetransfer the same as protocol? Aren't
all of these just the same aspect of a package (that is, the technology
it uses), and so shouldn't they go in a single 'technology' facet?).
- what is a 'suite'? It's clearly useful to categorise applications
by the bigger whole they are a part of, but is 'apache' really a
suite? And what applications are really part of gnome? What goes in
the suite 'debian'? Don't we have a thousand more (perl, GNU R, GCC
and its various compilers...)?
- how do we handle facets that allow categorization with lots of tags?
> Incidentally, I don't see any mention in the archives of
> /usr/lib/menu or /usr/share/doc-base files, which each implement
> "section" hierarchies slightly divergent from the old system of
> repository sections. I hope they aren't being overlooked.
Uhm, well, aehm, they were in fact overlooked, but now that you have
mentioned them they are not overlooked anymore ;)
How do we handle them? I see two possibilities (more can be figured
out):
- Directly map them into some of our tags
- Use them as heuristic data and implement some strategy in autodebtag
to deduce some tag from them.
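
To make the two options a bit more concrete, here is a rough sketch.
It only assumes that a menu file carries its section as a
section="..." field; the mapping table, the tag names in it and the
function names are all invented for illustration, and none of this is
how autodebtag actually works.

```python
import re

# Hypothetical section-to-tag table: the tag names are illustrative
# and are not taken from the real vocabulary.
SECTION_TO_TAGS = {
    "Apps/Editors": {"use::editing"},
    "Games": {"use::gameplaying"},
    "Games/Arcade": {"use::gameplaying", "game::arcade"},
}

SECTION_RE = re.compile(r'section="([^"]*)"')


def sections_in_menu_text(text):
    """Return every section="..." value found in a menu file's text."""
    return SECTION_RE.findall(text)


def direct_tags(text):
    """Option 1: exact lookup of each section, ignoring unknown ones."""
    tags = set()
    for section in sections_in_menu_text(text):
        tags |= SECTION_TO_TAGS.get(section, set())
    return tags


def heuristic_tags(text):
    """Option 2: use the section as a hint, falling back on its parents."""
    tags = set()
    for section in sections_in_menu_text(text):
        parts = section.split("/")
        # Try "Games/Puzzles", then "Games", stopping at the first match.
        while parts:
            key = "/".join(parts)
            if key in SECTION_TO_TAGS:
                tags |= SECTION_TO_TAGS[key]
                break
            parts.pop()
    return tags
```

With an entry whose section is "Games/Puzzles", the direct mapping
finds nothing because that exact section is not in the table, while
the heuristic falls back on "Games" and still suggests a tag. The same
idea could be tried on the section field of doc-base files.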
Justin, welcome aboard!
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <email@example.com>