[Debtags-devel] AI Tagger

Enrico Zini enrico at enricozini.org
Fri Aug 19 20:26:20 UTC 2005


On Thu, Aug 18, 2005 at 03:24:44PM +0200, Benjamin Mesing wrote:

> ok, I've added a --test-all-packages option. Unfortunately there seems
> to be a memory leak :-( -- it's not checked in yet. I was not able to tag
> all the packages. 
> I suspect it has something to do with the AptPkg library. The more
> packages I test, the more memory gets allocated.
> I have tried to undef the AptPkg::Cache instance in between, and create
> a fresh one, but this is counterproductive.
> Did anyone fiddle around with the Perl interface of AptPkg? Can anybody
> help?

I think that for your use you can skip AptPkg and parse the output of
apt-cache dumpavail yourself, like this:

use IO::File;

my %packages;
sub read_packages ()
{
        # Open a pipe from apt-cache; new() returns undef on failure
        my $in = IO::File->new("apt-cache dumpavail |")
                or die "Can't get package data: $!";

        # Records are separated by a blank line
        local $/ = "\n\n";

        # Read input one record at a time
        while (<$in>)
        {
                # Drop the record separator and skip empty records
                chomp;
                next unless /\S/;

                my %data;
                my $pkgname;

                # Split the fields of the record; lines starting with a
                # space or tab continue the previous field
                for my $field (split(/\n(?![ \t])/))
                {
                        my ($name, $val) = split(/:\s*/, $field, 2);
                        next unless defined $val;

                        # Trim spaces in the name
                        $name =~ s/^\s+|\s+$//g;
                        # Trim spaces in the value (also when it spans lines)
                        $val =~ s/^\s+|\s+$//g;
                        # Normalize multiline values
                        $val =~ s/\s*\n\s*/\n/g;

                        if ($name eq 'Package')
                        {
                                $pkgname = $val;
                        } else {
                                $data{lc $name} = $val;
                        }
                }

                die "Record $. has no package name" if not defined $pkgname;

                $packages{$pkgname} = \%data;
        }

        $in->close();
}

It takes something like 20 MB of memory to store that %packages, but
then you have all the data in there, fast, and it won't crash.  At the
moment, I'd go this way.
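
As a rough sketch of how the record splitting behaves, here is the same
field-parsing logic applied to an in-memory sample instead of the
apt-cache pipe.  The parse_record helper and the sample record are made
up for illustration; they are not part of the tagger.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one dumpavail-style record into
# (package name, hashref of the remaining fields, keys lowercased).
sub parse_record
{
        my ($record) = @_;
        my (%data, $pkgname);

        # A newline followed by a space or tab continues the previous field
        for my $field (split(/\n(?![ \t])/, $record))
        {
                my ($name, $val) = split(/:\s*/, $field, 2);
                next unless defined $val;

                $val =~ s/^\s+|\s+$//g;      # trim both ends
                $val =~ s/\s*\n\s*/\n/g;     # normalize continuation lines

                if ($name eq 'Package') {
                        $pkgname = $val;
                } else {
                        $data{lc $name} = $val;
                }
        }
        return ($pkgname, \%data);
}

# Made-up sample record, only for illustration
my $sample = "Package: foo\nSection: utils\n"
           . "Description: short text\n long line one\n long line two\n";

my ($name, $data) = parse_record($sample);
print "$name is in section $data->{section}\n";
print "$data->{description}\n";
```

With the real data loaded by read_packages, the equivalent lookup would
be $packages{foo}{section}, since every field name except Package is
stored lowercased.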


Ciao,

Enrico

--
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico at enricozini.org>