[Debtorrent-devel] Fwd: BitTorrent Protocol Expansion (Google SoC)

Cameron Dale camrdale at gmail.com
Fri Apr 13 01:48:22 UTC 2007


---------- Forwarded message ----------
From: Anthony Towns <aj at azure.humbug.org.au>
Date: Apr 2, 2007 1:57 AM
Subject: Re: BitTorrent Protocol Expansion (Google SoC)
To: Cameron Dale <camrdale at gmail.com>


On Sun, Apr 01, 2007 at 02:19:27PM -0700, Cameron Dale wrote:
> http://wiki.debian.org/AptBittorrent

Sweet. Some comments:

> "a lot of packages are too small"

I think I did some stats a while ago trying to get a handle on this
to work out piece sizes. No idea what I did with the data then, but
redoing it now seems straightforward. If I use /var/lib/dpkg/available
(which is in Packages file format):

    $ sed -ne 's/^Size: //p' /var/lib/dpkg/available | sort -n > foo.csv

and run that through gnumeric's "statistical analysis" stuff, I get:

        Mean                               757,299.01
        Standard Error                      26,260.22
        Median                              94,697.00
        Mode                                   792.00
        Standard Deviation               3,546,976.60
        Sample Variance         12,581,043,007,634.50
        Kurtosis                               633.33
        Skewness                                19.95
        Range                          161,312,492.00
        Minimum                                736.00
        Maximum                        161,313,228.00
        Sum                         13,816,163,106.00
        Count                               18,244.00

        95% CI for the Mean (lower)        705,826.52
        95% CI for the Mean (upper)        808,771.50

A mean package size of 757kB with a std-dev of about 3.5MB is noteworthy,
as is the spread between the minimum size of 736 bytes and the maximum
of 161MB.
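
For anyone without gnumeric to hand, the same numbers can be approximated in
a few lines of Python. This is only a sketch: the function names are made up
for illustration, the sample text is invented, and it assumes every stanza
carries a plain "Size: " line as in the sed command above.

```python
import statistics

def package_sizes(packages_text):
    """Return the integer value of every 'Size:' field in Packages-format text."""
    sizes = []
    for line in packages_text.splitlines():
        # Same match as the sed expression: only lines starting with "Size: "
        # (so "Installed-Size:" is ignored).
        if line.startswith("Size: "):
            sizes.append(int(line[len("Size: "):]))
    return sizes

def summarize(sizes):
    """Compute a few of the summary statistics quoted above."""
    return {
        "count": len(sizes),
        "sum": sum(sizes),
        "mean": statistics.mean(sizes),
        "median": statistics.median(sizes),
        "stdev": statistics.stdev(sizes),  # sample standard deviation
        "min": min(sizes),
        "max": max(sizes),
    }

# Invented three-package sample, not real archive data.
SAMPLE = """\
Package: foo
Size: 5341873

Package: bar
Size: 72856

Package: baz
Size: 736
"""

stats = summarize(package_sizes(SAMPLE))
```

To run it against a real system, read /var/lib/dpkg/available into a string
and pass that to package_sizes() instead of SAMPLE.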

> "Proposal A"
> "communicate all torrent information in the Packages file"
> "pieces can no longer be numbered"

The latter isn't actually true -- if you have a Packages file like:

        Package: foo
        Size: 5341873
        SHA1: 38170c08cb458fd4879c34b6f608294c50312bbb
        SHA1-pieces:
         e5fa44f2b31c1fb553b6021e7360d07d5d91ff5e 1048576
         7448d8798a4380162d4b56f9b452e2f6f9e24e7a 1048576
         a3db5c13ff90a36963278c6a39e4ee3c22e2a436 1048576
         9c6b057a2b9d96a4067a749ee3b3b0158d390cf1 1048576
         5d9474c0309b7ca09a182d888f73b37a8fe1362c 1048576
         ccf271b7830882da1791852baeca1737fcbe4b90 98993

        Package: bar
        Size: 72856
        SHA1: 9425fa8de16f6283365f6bee87f405da16a203e6

then you have 7 pieces all up: five of size 1048576, one of size 98993,
and one of size 72856. You can number them in order, i.e.:

        0 -> foo[0]
        1 -> foo[1]
        2 -> foo[2]
        3 -> foo[3]
        4 -> foo[4]
        5 -> foo[5]
        6 -> bar[0]

You're depending on your Packages file being in the same order on
different hosts, but that's more or less ok anyway. The major thing
that changes in that scenario is that the last piece of _every_ package
can be "short", rather than just the last piece of the torrent.
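
The numbering above can be sketched in Python roughly as follows. Note the
assumptions: "SHA1-pieces" is the field proposed in this mail, not something
existing Packages files carry, and the function name and the tuple layout are
invented for illustration. A package without the field is treated as a single
piece covering the whole file.

```python
import re

def number_pieces(packages_text):
    """Return a list of (package, index_within_package, size); the position
    in the list is the torrent-wide piece number."""
    numbering = []
    # Stanzas are separated by blank lines.
    for stanza in re.split(r"\n\s*\n", packages_text.strip()):
        name, size, piece_sizes, in_pieces = None, None, [], False
        for line in stanza.splitlines():
            # Continuation lines of SHA1-pieces start with a space:
            # " <sha1> <length>"
            if in_pieces and line.startswith(" "):
                _sha1, length = line.split()
                piece_sizes.append(int(length))
                continue
            in_pieces = False
            field, _, value = line.partition(":")
            value = value.strip()
            if field == "Package":
                name = value
            elif field == "Size":
                size = int(value)
            elif field == "SHA1-pieces":
                in_pieces = True
        if not piece_sizes:
            piece_sizes = [size]  # small package: one piece, the whole file
        if sum(piece_sizes) != size:
            raise ValueError(f"piece sizes of {name} do not add up to Size")
        for local_index, piece_size in enumerate(piece_sizes):
            numbering.append((name, local_index, piece_size))
    return numbering

EXAMPLE = """\
Package: foo
Size: 5341873
SHA1: 38170c08cb458fd4879c34b6f608294c50312bbb
SHA1-pieces:
 e5fa44f2b31c1fb553b6021e7360d07d5d91ff5e 1048576
 7448d8798a4380162d4b56f9b452e2f6f9e24e7a 1048576
 a3db5c13ff90a36963278c6a39e4ee3c22e2a436 1048576
 9c6b057a2b9d96a4067a749ee3b3b0158d390cf1 1048576
 5d9474c0309b7ca09a182d888f73b37a8fe1362c 1048576
 ccf271b7830882da1791852baeca1737fcbe4b90 98993

Package: bar
Size: 72856
SHA1: 9425fa8de16f6283365f6bee87f405da16a203e6
"""

pieces = number_pieces(EXAMPLE)
for global_index, (package, local_index, _size) in enumerate(pieces):
    print(f"{global_index} -> {package}[{local_index}]")
```

Because the numbers come purely from file order, two hosts only agree on
them if they parse byte-identical Packages files.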

> "Cons"
> "...difficult to find rare pieces"

A simpler approach might be to communicate "I'm planning on downloading
the entire torrent" or "I have downloaded the entire torrent", and
prioritise those peers. We have a bunch of well-connected mirrors around
already and I wouldn't expect that to change, so there's no reason not
to make use of it. And we have lots of people who have a full mirror
for their architecture(s) too who would participate in a p2p scheme,
so if you had a bit to flag those hosts, you'd probably be pretty okay.
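
As a rough sketch of what "a bit to flag those hosts" could look like on the
client side: the Peer class and both flag names below are hypothetical, and
nothing like them exists in the BitTorrent protocol today.

```python
from dataclasses import dataclass

@dataclass
class Peer:
    address: str
    has_everything: bool = False    # "I have downloaded the entire torrent"
    wants_everything: bool = False  # "I'm planning on downloading the entire torrent"

def prioritise(peers):
    """Order peers so full mirrors come first, then peers planning to become
    full mirrors, then everyone else. sorted() is stable, so peers within
    the same group keep their original relative order."""
    return sorted(
        peers,
        key=lambda p: (not p.has_everything, not p.wants_everything),
    )

peers = [
    Peer("10.0.0.1"),
    Peer("10.0.0.2", wants_everything=True),
    Peer("10.0.0.3", has_everything=True),
]
ordered = [p.address for p in prioritise(peers)]
```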

Another approach is to have the existing mirror network act as a
backchannel, so that if you can't download foo.deb from any peers in
reasonable time, you grab it from a regular http mirror instead.
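
The backchannel idea amounts to a try-peers-then-mirror wrapper. The sketch
below takes the two download functions as parameters, since how either would
really be implemented is left open here; all names and the default timeout
are placeholders.

```python
def fetch_with_fallback(name, fetch_from_peers, fetch_from_mirror,
                        timeout_seconds=30.0):
    """Try the p2p swarm first; fall back to a regular http mirror.

    fetch_from_peers(name, timeout_seconds) is assumed to return the file
    contents, return None if no peer has it, or raise TimeoutError if the
    peers are too slow. fetch_from_mirror(name) is assumed to always work.
    """
    try:
        data = fetch_from_peers(name, timeout_seconds)
    except TimeoutError:
        data = None
    if data is None:
        data = fetch_from_mirror(name)
    return data

# Stub implementations, just to show the control flow.
def slow_peers(name, timeout_seconds):
    raise TimeoutError(f"no peer delivered {name} within {timeout_seconds}s")

def fast_peers(name, timeout_seconds):
    return b"from peers"

def mirror(name):
    return b"from mirror"

via_mirror = fetch_with_fallback("foo.deb", slow_peers, mirror)
via_peers = fetch_with_fallback("foo.deb", fast_peers, mirror)
```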

> "how do peers communicate BITFIELD information of all the pieces they
>  have when the pieces are no longer numbered"

Probably by sending a lot of "HAVE ..." notices?
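
To give a feel for the cost: in the standard BitTorrent wire protocol a HAVE
message is a 4-byte length prefix (value 5), a 1-byte message id (4) and a
4-byte big-endian piece index, so 9 bytes per piece against one bit per piece
in a BITFIELD. The sketch below assumes pieces can still be given integer
indices; the helper names are made up.

```python
import struct

HAVE_ID = 4  # message id of HAVE in the BitTorrent peer wire protocol

def have_message(piece_index):
    """Encode one HAVE message: <len=5><id=4><piece index>, all big-endian."""
    return struct.pack(">IBI", 5, HAVE_ID, piece_index)

def have_messages(piece_indices):
    """Concatenate HAVE messages for every piece we hold, in index order."""
    return b"".join(have_message(i) for i in sorted(piece_indices))

# A peer holding pieces 0, 5 and 6 announces them with three messages.
announcement = have_messages({0, 5, 6})
```

For a peer that holds only a handful of packages out of a whole archive this
is cheap; for a full mirror it is a lot of traffic compared with a bitfield.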

Cheers,
aj


-----BEGIN PGP SIGNATURE-----

iD8DBQFGEMV2Oxe8dCpOPqoRAuD0AKCo4/2VeYGD2L68A2RuyeteyiRvWgCeIIy8
qndEMf7g91yL7axwW4c71I0=
=Fu6f
-----END PGP SIGNATURE-----


