[Debtorrent-devel] Fwd: BitTorrent Protocol Expansion (Google SoC)

Cameron Dale camrdale at gmail.com
Fri Apr 13 01:50:02 UTC 2007


---------- Forwarded message ----------
From: Anthony Towns <aj at azure.humbug.org.au>
Date: Apr 6, 2007 6:40 PM
Subject: Re: BitTorrent Protocol Expansion (Google SoC)
To: Cameron Dale <camrdale at gmail.com>


On Fri, Apr 06, 2007 at 11:08:29AM -0700, Cameron Dale wrote:
> My idea for this proposal is that you don't have a new torrent every day,
> instead you have a single torrent for every suite and architecture (separate
> all) combination. From the proposal: "create a torrent for every combination of
> suite (stable/testing/unstable) and architecture, including separate ones for
> architecture:all and source." That torrent stays the same, even though the
> Packages file changes every day, and new pieces get added. This allows you to
> maintain communication between peers who have Packages files from different days.

Is that necessarily desirable? Wouldn't it cause problems if you're on
the i'th Packages file, but most of the peers in the torrent are on the
j'th, so when you connect to a random peer, and try to download a piece
of a package that's been updated between i and j, they're unlikely to have
it (or want it)?

> If you have a new torrent file each day, then in order for me, who downloaded
> the Packages file today, to share pieces with you, who downloaded it yesterday,
> I would have to participate in both torrents.

Well, don't forget that we're updating the Packages file twice a day
at the moment, and expecting that to increase. Ubuntu does it (up to)
48 times a day, iirc, and we'd certainly like to consider between 4 and
12 times a day.

> Add in people who downloaded the
> Packages file up to 100 days ago, and I'm now running 100 torrents just to share
> with all of them. And I can now contact you through 99 of those torrents, since
> you're also running them all. It's getting complicated.

I'm not sure 100 days ago is interesting anyway though. Presumably you can
expect to mostly have a few different likely frequencies for testing/unstable
users:

        * obsessive-compulsive updating of every change, ASAP
        * once every 12, 24 or 48 hours
        * once or twice a week

I wouldn't really expect it to be all that interesting to worry about
update frequencies much lower than that -- you're not going to have enough
peers updating that rarely for it to be interesting, afaics.

But even so, at once a week with 12 times a day, you're potentially up
to 84 different Packages files anyway (though currently no more than 14).
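
A rough sketch of that arithmetic, assuming the count is simply updates
per day times the days between a peer's own updates (the function name is
made up for illustration):

```python
def packages_files_in_window(updates_per_day, days_between_peer_updates):
    """Upper bound on distinct Packages files published between two
    consecutive updates by the same peer."""
    return updates_per_day * days_between_peer_updates

# Weekly updater, archive refreshed 12 times a day (the high end considered).
print(packages_files_in_window(12, 7))  # 84
# Weekly updater, archive refreshed twice a day (the current rate).
print(packages_files_in_window(2, 7))   # 14
```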

Another aspect: if you're trying to share amongst all those peers, you don't
actually have to participate in all the torrents. You can do it indirectly
instead:

             torrent a           torrent b
Peer 1           Y
Peer 2           Y                  Y
Peer 3                              Y

So if Peer 1 has a piece that's common across both torrents, Peer 2 will
get it via torrent a, then be able to share it with Peer 3 via torrent b.
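
The relay in the table can be sketched as a toy model. This is only an
illustration: the peer names, the `spread` helper and the rule "a peer can
fetch a piece from any holder it shares a torrent with, provided that
torrent contains the piece" are all assumptions, not anything a real
client implements in this form.

```python
# Torrent membership, copied from the table above.
membership = {
    "peer1": {"a"},
    "peer2": {"a", "b"},
    "peer3": {"b"},
}

def spread(piece_torrents, membership, seed):
    """Return the peers that can eventually obtain a piece held by `seed`.

    piece_torrents: the torrents whose contents include the piece.
    """
    have = {seed}
    changed = True
    while changed:
        changed = False
        for peer, torrents in membership.items():
            if peer in have:
                continue
            # Copy the holder set so adding to `have` is safe mid-loop.
            for holder in list(have):
                if torrents & membership[holder] & piece_torrents:
                    have.add(peer)
                    changed = True
                    break
    return have

# A piece common to both torrents reaches Peer 3 through Peer 2.
print(sorted(spread({"a", "b"}, membership, "peer1")))
# A piece that only exists in torrent a stops at Peer 2.
print(sorted(spread({"a"}, membership, "peer1")))
```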

> Whether you call it a torrent or something else, you need a giant swarm of peers
> all talking to each other, no matter what day they downloaded the Packages file
> on, since they will have something like 90% of the pieces in common. And when
> one says to another, I have package foo, they need to know they're talking about
> the same version. Since peers within a torrent only exchange information in the
> form of piece numbers, these need to uniquely identify a package AND version.

Treating the path in the pool (pool/main/g/gamin/libgamin0_0.1.7-4_powerpc.deb)
as unique-per-file should be fine for that in almost all cases, fwiw.
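
As a sketch of what keying on the pool path buys you: the path already
carries package name, version and architecture, so a new version gets a
new path and hence a new number. The `PieceRegistry` class below is
hypothetical, and it assumes one piece per file purely to keep the example
short.

```python
class PieceRegistry:
    """Hypothetical allocator: each pool path gets one piece number,
    assigned on first sight and never reused."""

    def __init__(self):
        self._by_path = {}
        self._next = 0

    def piece_for(self, pool_path):
        # Assumption: one piece per file; large packages would really
        # need a contiguous range of numbers instead.
        if pool_path not in self._by_path:
            self._by_path[pool_path] = self._next
            self._next += 1
        return self._by_path[pool_path]

registry = PieceRegistry()
old = registry.piece_for("pool/main/g/gamin/libgamin0_0.1.7-4_powerpc.deb")
# A hypothetical later version: different path, so a different number.
new = registry.piece_for("pool/main/g/gamin/libgamin0_0.1.7-5_powerpc.deb")
print(old, new)  # 0 1
```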

> > There's that, which I definitely agree on, and there's also that it
> > might make implementation easier, since piece numbering is presumably
> > a pretty fundamental assumption, that it'll be awkward to break.
> I think it's possible to do though. Instead of storing an array of pieces you
> have, you would just store a dictionary of hash values. But, as I said, bitfield
> may prove necessary.

Storing hash values like that can take up a lot of memory is all;
there are a lot of pieces, and if you're not storing it in some compact
way... Same problem as the BITFIELD versus HAVE messages, really, just
more local.
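
To put rough numbers on that: a bitfield costs one bit per piece, while
storing the 20-byte SHA-1 piece hashes costs at least 20 bytes per piece
held, before any per-entry overhead the dictionary itself adds. The piece
count below is a made-up figure for illustration, not a measurement of the
archive.

```python
SHA1_DIGEST_BYTES = 20  # BitTorrent piece hashes are SHA-1 digests

def bitfield_bytes(num_pieces):
    """Bytes needed for one bit per piece, rounded up to whole bytes."""
    return (num_pieces + 7) // 8

def hash_payload_bytes(num_pieces):
    """Bytes of raw digest data if every held piece is stored by hash."""
    return num_pieces * SHA1_DIGEST_BYTES

num_pieces = 100_000  # hypothetical piece count
print(bitfield_bytes(num_pieces))      # 12500
print(hash_payload_bytes(num_pieces))  # 2000000
```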

> These numbers definitely seem like it would be possible to keep unique piece
> numbers for long periods of time before reuse. I think the release of a new
> version might be the perfect time to start fresh from piece number 0. It's
> probably possible to keep unique piece numbers for a given release (say 'etch')
> all through the release's lifetime without ever having to reuse piece numbers,
> especially since it will need very few new piece numbers once it becomes a
> stable release.

The overall lifespan of etch looks like:

        22 months as testing
        18 (?) months as stable
        18 months as oldstable (with security support and possibly
           point updates)

That's 58 months or just under five years, longer if lenny takes more
than 18 months to release, and also if security support gets extended.

sarge's lifespan should be something like:

        20 Jul 2002 - sarge as testing
         6 Jun 2005 - sarge as stable
           Apr 2007 - sarge as oldstable
           Oct 2008 - sarge retired

which would be a bit over six years or about 74 or 75 months.
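
Both totals are easy to check. The sketch below counts whole calendar
months and ignores the day of the month, which is an assumption that
accounts for the "74 or 75" wobble; `months_between` is just an
illustrative helper.

```python
def months_between(start, end):
    """Whole calendar months from a (year, month) start to a (year, month) end."""
    (y0, m0), (y1, m1) = start, end
    return (y1 - y0) * 12 + (m1 - m0)

# etch: testing + stable + oldstable, using the month counts listed above.
etch_months = 22 + 18 + 18
# sarge: Jul 2002 (became testing) through Oct 2008 (retired).
sarge_months = months_between((2002, 7), (2008, 10))
print(etch_months, sarge_months)  # 58 75
```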

Note that the "testing" lifetime has the suite's pieces vary from an
exact match of one stable release to an exact match of the next stable
release; which is usually a pretty major variation.

sid and experimental don't have a defined endpoint; I'm not sure what
you'd want to do about them. I'm not sure what (if anything) you'd do
when a new suite (like lenny) gets introduced either.

> >> Just so you know, [...] Just something to think about.
> > Well, hopefully tomorrow we find out slot numbers so we really can think
> > about this stuff. :)
> Any word on this yet? They're 2 days overdue already. Guess you can't complain
> as they're the ones with the money in this.

Yup, we got word a couple of hours ago, only nine slots. :(

> I'll update my proposal with some of these unique piece number ideas, as I am
> starting to think that is the way to go. Thanks for all the thoughts and ideas.

Sweet. Worth adding a comment to your app with a pointer to it.

Daniel Burrows <dburrows at debian.org> and Michael Vogt <mvo at debian.org>
have both offered to help (particularly with apt-acquire code if
necessary) in the private comments on your app btw.

Cheers,
aj


-----BEGIN PGP SIGNATURE-----

iD8DBQFGFvaSOxe8dCpOPqoRAiA5AJ48Cn0pf0kjsg56vICypa1zrFug3ACgpPPA
4F7kUJuIIFKt5vqRRxIRENQ=
=vOPc
-----END PGP SIGNATURE-----


