[Debtags-devel] Recent progress

Erich Schubert <erich.schubert@gmail.com>
Sat, 12 Feb 2005 17:35:52 -0800


Hi.
I've committed code to SVN (debtags/central-database/c-bdb) that loads
the packages and vocabulary files - tag assignments to follow next -
into a BerkeleyDB database for high-speed access.
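
As a rough illustration of the loading step (not the committed C code),
here is a minimal Python sketch that parses packages-file stanzas and
keeps only a few fields, keyed by package name. The function name
parse_stanzas, the set of kept fields and the sample data are all
assumptions made for this example, and a plain dict stands in for the
BerkeleyDB b-tree.

```python
# Hypothetical sketch: a dict stands in for the BerkeleyDB b-tree.
# The kept fields are an assumption, not the real c-bdb field list.
def parse_stanzas(text, keep=("Package", "Version", "Description")):
    """Parse control-file style stanzas into {package: {field: value}}."""
    records = {}
    for stanza in text.strip().split("\n\n"):
        fields = {}
        last = None
        for line in stanza.splitlines():
            if line[:1] in (" ", "\t") and last is not None:
                # Continuation line: append to the previous field.
                fields[last] += "\n" + line.strip()
            elif ":" in line:
                name, _, value = line.partition(":")
                last = name.strip()
                fields[last] = value.strip()
        package = fields.get("Package")
        if package:
            # Discard every field that is not in the keep list.
            records[package] = {k: v for k, v in fields.items() if k in keep}
    return records


# Made-up sample data, for illustration only.
SAMPLE = """\
Package: debtags
Version: 1.0
Installed-Size: 1234
Description: tag tool
 longer text

Package: libdb4.2
Version: 4.2.52
Maintainer: Someone
"""

records = parse_stanzas(SAMPLE)
print(sorted(records))
```

Dropping the unused fields before storing is what keeps the database
small relative to the raw file.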

I just ran db_stat to gather some statistics.
Stored in a b-tree, the packages list is a 3-level database with 15689
keys currently.
The fill rate of the data pages is around 45%, so we pay about a
2.5-fold increase in storage to achieve the expected speed gains.
Vocabulary is a 2-level bdb, with 439 keys and a 30% fill ratio.
Total size for these two data sets is 15 MB, compared with 13 MB of raw
data (note: more than half the raw data is discarded, especially in
the packages file!)
The corresponding MySQL tables take about 7 MB.
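
For a back-of-the-envelope check of the fill-rate figures, the sketch
below uses the simplest possible model: storage grows by the reciprocal
of the page fill ratio. The helper name expansion_factor is made up for
this example, and the model ignores internal pages and per-entry
overhead, so it only lands in the same ballpark as the figure quoted
above.

```python
def expansion_factor(fill_ratio):
    """Approximate storage blow-up for pages filled to fill_ratio (0..1]."""
    if not 0 < fill_ratio <= 1:
        raise ValueError("fill_ratio must be in (0, 1]")
    return 1.0 / fill_ratio


# Fill ratios taken from the statistics above.
for label, fill in (("packages", 0.45), ("vocabulary", 0.30)):
    print(f"{label}: {expansion_factor(fill):.2f}x")
# prints:
# packages: 2.22x
# vocabulary: 3.33x
```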

So without tuning this wastes a lot of memory, but that is certainly
acceptable for speeding up the web application.

Greetings,
Erich Schubert
--
    erich@(mucl.de|debian.org)      --      GPG Key ID: 4B3A135C    (o_
  To understand recursion you first need to understand recursion.   //\
  Wo befreundete Wege zusammenlaufen, da sieht die ganze Welt für   V_/_
        eine Stunde wie eine Heimat aus. --- Hermann Hesse