Debtags and Debram (Was: Re: New debtags suite just uploaded)

Thaddeus H. Black t@b-tk.org
Fri, 2 Jul 2004 15:29:41 +0000



This post is somewhat lengthy.  Who reads long
posts?  I will keep it as short as I can.  At
bottom is a specific question to which I would
appreciate feedback from Enrico, Erich, Michael,
Hervé, Benjamin, Evan and any others who feel
that they may have sufficient knowledge to
speak.

Enrico wrote,

> I'm having a hard time understanding the
> structure of your categorization: I see you're
> using some important background in library
> science that I'm not knowledgeable of, and I'm
> even having trouble understanding the logic in
> which your hierarchy is structured.

I confess that I have no background in library
science.

A bit of debram history may serve to clarify the
situation.  The history begins in the potato era
with a rationale: namely that, like you and
everyone else on this list, I couldn't ever find
anything in Debian.  It was one giant heap of
four thousand packages.  This drove me crazy,
and since neither Ranganathan nor any other
skilled librarian actually seemed to be working
on solving the problem, in March 2002 I started
sorting the packages into piles, then dividing
the piles into smaller piles, then moving the
little piles around subjectively until they
attained an overall pattern that somewhat
followed the traditional Unix man sections and
otherwise made subjective sense to me.  Then I
labeled the piles and sorted them some more.
The process was entirely iterative.  I suppose
that by now I have attained a better overall
knowledge of the contents of the Debian archive
than all but a very few people in the world, but
I still do not know the first principle of
library theory.  I just sort stuff.

Then you debtags guys came along (or maybe you
were there before, but I didn't know it).  You
had some great ideas which I had never thought
of.  I was working mostly in isolation and, not
yet being on the DD keyring, I thought it best
just to lurk for a while, continuing to work on
the debram.  I did not know you, after all.
I had already done a lot of work on the debram.
I did not know if you were serious.  How could I
know?  Most people who post ideas on lists are
not serious.  Tagging the archive is a big
project.  I did not want to risk starting
discussions with you if you might just go MIA
six months later.

Well.  You guys are still here and I am pleased
to observe that you seem to be 100 percent
serious.  This is a relief to me.  The big heap
of packages has tripled in size since the potato
days, and it is still a BIG HEAP.  This is
surreal.  It is also too big for me to handle
alone.  I am pleased and relieved to lace up my
boots and join your army.

However, we have a temporary but real problem.
You started by defining tags to assign to
packages.  I started by sorting packages into
piles then naming the piles.  Your approach is
smarter, but under my approach the work is
nearly done.  And the two approaches are
apparently not mixing.  In chemical terms, what
we want is some detergent that makes debram
soluble in debtags.  We want debram to mix.

> And also, how do you build the hierarchy?  How
> do you decide that, for example, Audio is
> under "User oriented packages" but Graphics is
> under X/X Applications/ ?

Good question.  Looking at it from the debram's
perspective, can you see how it would happen
this way?  I did not design the hierarchy then
sort the packages into it; I sorted the packages
then built the hierarchy around them.  From this
perspective, [1800 X] emerges as a natural
division of the archive.  The [1900 Audio]
emerges as a natural division, too, but of a
different character because its various contents
have a unifying theme but lack a mutual set of
dependencies.  I didn't design it this way; this
is just how Debian has grown up.  In general,
the more Maintainers were working on a
particular kind of package or the more
fundamental the kind of package seemed to
general Debian operation, the more prominent a
place that kind of package earned in the tree.
Packages with similar purposes, similar
audiences and/or similar dependencies tend to go
together; but whether the purpose, the audience
or the dependency is the dominant consideration
varies over the hierarchy: sorting
considerations which work well
in [1300 Programming], for example, may not work
so well for [1800 X].  No geometrical precision
is possible in this.

The biggest single problem with the debram is
that some packages belong naturally and strongly
in each of two different branches.  For
various reasons, I wanted (and still want) each
package to have a single principal home on the
tree, but for some packages this approach has
created real problems.  The problems are
particularly acute within

  1580 CJK (Chinese/Japanese/Korean),
  1716 SGML / XML,
  1720 Mathematics,
  8151 Kernel Control and Management of
    Central Hardware,

and a few others.  The debtags solve this
problem, and do it in a way which does not
fundamentally require us to give up debram's
gains in this area.
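
To make the contrast concrete, here is a minimal
sketch of how a single principal home could
coexist with extra tags.  It is an illustration
only, not how debram or debtags actually store
their data.  The branch numbers and titles come
from the list above; the package names pkg-a and
pkg-b, the ALSO_FITS table and the "debram::" tag
prefix are all hypothetical.

```python
from collections import defaultdict

# Branch titles quoted from the list above.
BRANCHES = {
    "1580": "CJK (Chinese/Japanese/Korean)",
    "1716": "SGML / XML",
    "1720": "Mathematics",
}

# Hypothetical packages: each has exactly one principal home ...
PRINCIPAL_HOME = {
    "pkg-a": "1716",
    "pkg-b": "1720",
}

# ... and may also fit other branches (hypothetical data).
ALSO_FITS = {
    "pkg-a": ["1580"],
}


def to_tags(package):
    """Return the package's tags, principal home first."""
    branches = [PRINCIPAL_HOME[package]] + ALSO_FITS.get(package, [])
    # The "debram::" prefix is an invented naming scheme.
    return ["debram::" + branch for branch in branches]


def packages_by_tag(packages):
    """Invert the mapping: tag -> packages carrying that tag."""
    index = defaultdict(list)
    for package in sorted(packages):
        for tag in to_tags(package):
            index[tag].append(package)
    return dict(index)
```

Under this sketch pkg-a keeps its one home in
[1716 SGML / XML] yet is also found under the
CJK tag, so the tree survives as one tag among
several.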

> I'm hoping that, while understanding more of
> each other, merging ideas will come
> spontaneously.

They may.  I hope that they do.  One way or
another, the debram must be made to mix.

Unless someone has an alternate plan---and I do
not deny that a good alternate plan may exist; my
ears are open---I see little else to do at the
exact moment but to finish ramifying the debram
as intended: complete through sarge.  It is hard
to take concrete steps to merge debram into
debtags until debram has a complete, checked
body of tags to merge.  If this is wrong in your
view, please tell me so; and if possible please
suggest a workable alternative.

QUESTION

If this is correct, however, then here is my
question: do we all now feel that the debtags
stand on sufficiently firm ground, that it would
be safe to abandon debram development after
completion through sarge?  Continuing to develop
two incompatible tagging systems in parallel
seems a waste of scarce effort: apparently we
all agree on this.  I want to develop debtags,
not debram, yet now I am spending all my Debian
time on debram.  The question is, when and how
can I stop?  After two years of steady
development, the debram has some momentum; it is
not so easy to stop.

I understand that some readers of this post will
not yet actually have seen the debram.  This is
okay, but if you are following the debtags
development, then you probably want to go to see
debram now---if only to understand the present
dilemma.  If you have a sid or sarge setup, it
will not take much of your time.  Fetch debram
and debram-data (there is nothing else, just
these two), install them (they depend only on
the standard libs), read the first few lines of
the man page and go.  Twenty minutes later you
will have gained a fairly good idea of what the
debram is all about, and I think that you will
begin to understand the present issue.  Maybe
some pertinent thoughts on how to accomplish the
transition from debram to debtags will occur to
you in the process.  If so, then your advice
will be well appreciated, because I need some
help to bridge this gap.
