yada yada, new_deb822.py

Adeodato Simó dato at net.com.org.es
Sat Aug 19 15:46:01 UTC 2006


Hey.

John, I've recently realized that deb822.py is unable to cover one of my
use cases, which is _fast_ iteration through e.g. unstable's Packages file.
This is expected, of course, since the implementation is pure Python,
but whilst I can see the advantages of having a pure Python parser for
rfc822 files, I still need the speed.

Yesterday Jeroen van Wolffelaar pointed out to me that python-apt itself
has an interface to parse rfc822-like files, but a rather rudimentary one.
It is very fast, though.

Since speed is what I need, I'll probably be using something similar to
this in my application:

  http://people.debian.org/~adeodato/tmp/2006-08-19/new_deb822.py

A comparison on my system:

  % =time -p python -c 'import deb822 as deb822; [ None for x in deb822.Deb822.iter_paragraphs(file("/dev/shm/Packages"))]'
  real 13.76
  user 12.98
  sys 0.11

  % =time -p python -c 'import new_deb822 as deb822; [ None for x in deb822.Deb822.iter_paragraphs(file("/dev/shm/Packages"))]'
  real 0.98
  user 0.89
  sys 0.02

I want to work on my application for a bit now, but if you'd like, I can
set aside some time to merge both implementations, so that apt_pkg is
used if available, and the Python one if not.

BTW, these are my pending branches against the old deb822.py:

  http://people.debian.org/~adeodato/code/branches/deb822/use_iterables/
  http://people.debian.org/~adeodato/code/branches/deb822/limit_fields/

Cheers,

-- 
Adeodato Simó                                     dato at net.com.org.es
Debian Developer                                  adeodato at debian.org
 
You cannot achieve the impossible without attempting the absurd.



