[Debtorrent-devel] Fwd: BitTorrent Protocol Expansion (Google SoC)

Cameron Dale camrdale at gmail.com
Fri Apr 13 01:52:26 UTC 2007


---------- Forwarded message ----------
From: Anthony Towns <aj at azure.humbug.org.au>
Date: Apr 9, 2007 8:19 PM
Subject: Re: BitTorrent Protocol Expansion (Google SoC)
To: Cameron Dale <camrdale at gmail.com>


On Mon, Apr 09, 2007 at 02:43:16PM -0700, Cameron Dale wrote:
> Since the first step is variable size pieces, I'm wondering, are we sure this is
> the way to go? What about my alternate suggestion on the wiki for a smart
> ordering of pieces that would minimize the wasted bandwidth? Maybe we should
> generate some statistics on this to check?

lenny's active, so monitoring sid and lenny for a week should be a good
start. The size of all debs in a Packages file (arch:i386 and arch:all)
is 14,337,091,622B by my count, so around 14GB. That should give you a start,
afaics.

The unavoidable drawback, afaics, is that piece boundaries will almost
never start on a file boundary, so there's no way to reuse the sha1 info
from the Packages file as the "piece" checksum info. Presumably we're
talking 1MB piece sizes, so that's an extra 14k sha1sums that need to
be calculated and distributed every pulse. Calculating them means
sha1'ing the entire archive; distributing them adds a new 14k*20B, so
an extra 280kB per suite. It's not the 280kB that worries me, though:
it's regenerating sha1s for something like 100GB of the archive twice
(or more) a day that sounds impossible to me.
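
To make the arithmetic above concrete, here is a rough Python sketch.
It assumes "1MB" means 2**20 bytes and that files are simply laid end
to end in the torrent; the function names are hypothetical and are not
part of any existing client.

```python
# Rough sketch of the piece/hash arithmetic; all names are illustrative.
ARCHIVE_BYTES = 14_337_091_622   # size of all debs, from the count above
PIECE_BYTES = 2**20              # assumption: "1MB" taken as 1 MiB
SHA1_BYTES = 20                  # a SHA-1 digest is 20 bytes


def piece_count(total_bytes, piece_bytes=PIECE_BYTES):
    """Number of fixed-size pieces needed to cover total_bytes."""
    return -(-total_bytes // piece_bytes)  # ceiling division


def hash_overhead(total_bytes, piece_bytes=PIECE_BYTES):
    """Bytes of piece checksums that must be distributed."""
    return piece_count(total_bytes, piece_bytes) * SHA1_BYTES


def aligned_file_starts(file_sizes, piece_bytes=PIECE_BYTES):
    """Count files whose first byte falls exactly on a piece boundary,
    assuming the files are concatenated in the given order."""
    offset = 0
    aligned = 0
    for size in file_sizes:
        if offset % piece_bytes == 0:
            aligned += 1
        offset += size
    return aligned


if __name__ == "__main__":
    print(piece_count(ARCHIVE_BYTES), "pieces")
    print(hash_overhead(ARCHIVE_BYTES), "bytes of sha1sums")
```

Feeding aligned_file_starts the Size fields from a real Packages file
would show how few debs begin on a piece boundary, which is why the
per-file sha1s there can't double as piece checksums.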

> And what about the choice of BitTornado as a starting client to modify? I was
> looking through the long bug list for it, and I see a couple of things that
> might be a problem. First, its UPnP support seems to be Windows based, and so
> not functional.

Oh cool. Well, I presume that's fixable, and if it's not, I don't really think
manually adding some port forwarding is a big deal. (I hadn't even considered
UPnP as an option, to be honest.)

> The second problem that comes up a lot with BitTornado is its complete lack of
> support for encodings such as unicode and utf8. Do you foresee any problems with
> the file names that we'll be dealing with? Is there some kind of standard
> detailing the encoding that Debian archives use for their files?

Package names are limited to '[a-z0-9][a-z0-9+.-]+', versions
are a subset of '[0-9a-zA-Z.+:-]+'. Filenames are limited to
'<package>_<version>_<arch>.<extension>'. The ":" could conceivably be
re-encoded as per HTTP as "%3a" (that's what apt does), but currently
isn't used.
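
As a quick illustration of why encodings shouldn't bite here, the
patterns above can be combined into one check, sketched in Python
below. The package and version patterns are the ones given above; the
arch and extension patterns and the function names are assumptions
made only for this example.

```python
import re

PACKAGE = r"[a-z0-9][a-z0-9+.-]+"   # from the pattern above
VERSION = r"[0-9a-zA-Z.+:-]+"       # superset of allowed versions, as above
ARCH = r"[a-z0-9-]+"                # assumption: e.g. i386, all
EXT = r"u?deb"                      # assumption: .deb or .udeb only

FILENAME_RE = re.compile(
    rf"(?P<package>{PACKAGE})_(?P<version>{VERSION})"
    rf"_(?P<arch>{ARCH})\.(?P<ext>{EXT})"
)


def parse_deb_filename(name):
    """Split '<package>_<version>_<arch>.<extension>' into its parts.

    Returns a dict of the parts, or None if the name doesn't fit.
    Expects the unescaped form (a literal ':' rather than '%3a').
    """
    match = FILENAME_RE.fullmatch(name)
    return match.groupdict() if match else None


def encode_version(version):
    """Re-encode ':' as '%3a', the lowercase form mentioned above."""
    return version.replace(":", "%3a")


if __name__ == "__main__":
    print(parse_deb_filename("apt_0.6.46.4_i386.deb"))
    print(encode_version("1:2.3-4"))
```

Every character these patterns admit is plain ASCII, so a name that
passes the check has the same bytes in ASCII, Latin-1 and UTF-8.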

http://ftp.debian.org/debian/indices/files/components/ has some lists of
files on the mirrors that might give you a better idea.

Cheers,
aj




