[Debtorrent-devel] Fwd: BitTorrent Protocol Expansion (Google SoC)

Cameron Dale camrdale at gmail.com
Fri Apr 13 01:51:38 UTC 2007


---------- Forwarded message ----------
From: Cameron Dale <camrdale at gmail.com>
Date: Apr 8, 2007 10:59 PM
Subject: Re: BitTorrent Protocol Expansion (Google SoC)
To: Anthony Towns <aj at azure.humbug.org.au>


Anthony Towns wrote:
>>>     - sharing files between torrents seems a worry, but a necessary
>>>       one since we'll want to not double the space people need
>>>       to watch testing and unstable.
>> Something like the pooling done by the archive would seem to make sense.
>> We just need to be careful not to run into problems where 2 torrents
>> are trying to update the same pool file.
>
> I guess we could just do locking?

I'm not sure what you mean. File locking?
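
If file locking is what's meant, a minimal sketch might look like the
following. It assumes a Unix system and uses advisory locks via fcntl.flock;
the helper name write_pool_file is hypothetical and not part of any existing
client.

```python
import fcntl
import os


def write_pool_file(path, data):
    """Replace a shared pool file while holding an exclusive advisory lock.

    Hypothetical helper: two torrents updating the same pool file would
    each call this, and the second caller blocks until the first is done.
    """
    # Open without truncating, so nothing is clobbered before the lock is held.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks while another process holds it
        try:
            f.truncate(0)
            f.write(data)
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Advisory locks only help if every writer takes them, so this would only
protect against other cooperating torrent processes.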

>>>     - there's a real issue if the torrents have the same "file"
>>>       with different contents (e.g. an old torrent had a file in the pool,
>>>       which was deleted, then later recreated with new contents
>>>       and included in a new torrent; this *should* never happen,
>>>       but it's not 100% assured)
>> I'm having trouble visualizing this happening. Could you give a more
>> concrete example? I would think that the new torrent would either be a
>> replacement for the old one, so you would never run both at once,
>
> One scenario is something like foo_1.0.orig.tar.gz gets uploaded for
> foo 1.0, then obsoleted by foo_1.1.orig.tar.gz, but then an epoch gets
> added and a different foo_1.0.orig.tar.gz gets uploaded for foo 1:1.0.
>
> The archive won't allow that to happen simultaneously, but it could happen
> within a week or so, so that the old foo_1.0.orig.tar.gz might be on your
> system still while you're trying to get the new foo_1.0.orig.tar.gz.

I think this would work fine with the current BitTorrent implementation. While
running the old torrent, you would have the old hash for the file, and the file
would match it and be fine. When you update to the new torrent the hash would
change, and during torrent startup it would notice this, mark the file as
corrupt or something, and try to download the new piece/package to match the
hash. Do you see a possible problem in there?
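
A rough sketch of that startup check, assuming one SHA1 per file and a flat
pool directory. The function names and the shape of the "expected" mapping
are hypothetical, not taken from any existing BitTorrent client.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA1 of a file, reading it in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def files_to_refetch(expected, pool_dir):
    """List files that are missing or no longer match the new torrent.

    expected maps file name -> hex SHA1 taken from the new torrent
    (an assumed structure, for illustration only).
    """
    stale = []
    for name, want in sorted(expected.items()):
        path = os.path.join(pool_dir, name)
        if not os.path.exists(path) or sha1_of_file(path) != want:
            stale.append(name)  # old contents or absent: download again
    return stale
```

With the foo_1.0.orig.tar.gz example above, the old tarball left on disk
would hash differently from the entry in the new torrent, so it would be
listed for download again.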

I know this is a pretty rare/minor issue, but being aware of these situations
and planning for them will make everything run smoother in the end.

>> A lot of the changes we've talked about (including the one you mentioned
>> in the previous paragraph) require some kind of modification to the
>> archive software, and I haven't yet considered how easy/fast/possible
>> these changes will be.
>
> Adding information that's calculated from the .deb to the Packages file
> (like separate sha1's for each x kB block in the .deb, for some constant
> x) is easy enough; adding information that's based on the package but not
> specific to a version is easy too; adding information that's specific
> to a particular file but can't be calculated from the file directly is
> new and presumably hard.

So, if I read this correctly, this means we can make the Packages file into a
torrent easily enough in the short term, but adding unique piece numbers would
be more difficult, as they require some kind of state information to be kept.
What info do you have in mind when you say "adding information that's based on
the package but not specific to a version"? And how hard would it be to add
torrent identifiers to the Release files?
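
For reference, the "separate sha1's for each x kB block" idea quoted above
can be computed from the .deb alone, which is why it falls in the easy
category. A minimal sketch, where the function name and the default block
size of 256 kB are assumptions chosen only for illustration:

```python
import hashlib


def block_sha1s(path, block_kb=256):
    """Return one hex SHA1 per fixed-size block of the file.

    The last block may be shorter than block_kb kilobytes.
    """
    block_size = block_kb * 1024
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digests.append(hashlib.sha1(block).hexdigest())
    return digests
```

Nothing here depends on state outside the file, unlike unique piece numbers,
which would have to be remembered between archive runs.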

>> For now, we can use the current Packages files
>> as single-piece-per-package torrent files (though some pieces would be
>> very large), but eventually some functionality would need to be added to
>> dak/apt-ftparchive to implement more interesting/efficient features. Who
>> would we need to talk to about this, and how responsive will they be to
>> supporting something that will be alpha/beta for a long time?
>
> The easy changes can be done with me, mvo and daniel without much problem.
>
> The hard change needs changes to apt which might be difficult to code,
> and will require mvo checking them in some detail at least; and also
> will require changes to dak, which will require more testing and review
> by ftpmaster (me, James Troup, Ryan Murray).

This is just the unique piece numbers, right? The rest is easy?
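
To make the "current Packages files as single-piece-per-package torrent
files" idea concrete, here is a rough sketch that turns Packages stanzas into
one piece per package using the existing Filename, Size and SHA1 fields. The
function name and the tuple layout are hypothetical; a real implementation
would presumably live in the client's torrent parsing code.

```python
def packages_to_pieces(text):
    """Turn the text of a Packages file into (filename, size, sha1) tuples.

    Each stanza becomes one variable-size piece. Stanzas lacking any of
    the three fields are skipped.
    """
    pieces = []
    for stanza in text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Continuation lines of multi-line fields start with whitespace.
            if line[:1] in (" ", "\t") or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if all(k in fields for k in ("Filename", "Size", "SHA1")):
            pieces.append(
                (fields["Filename"], int(fields["Size"]), fields["SHA1"])
            )
    return pieces
```

Since each piece is a whole package, piece sizes vary from a few kB to
many MB, which is the "some pieces would be very large" caveat above.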

> Sounds like the next steps would be:
>
>    * break packages into smaller pieces
>
>    * determine usage patterns
>
>    * based on measured usage patterns analyse ways to optimise sharing
>      amongst different Packages files (arch:all, different versions,
>      testing/unstable, testing/stable at release)
>
> Hrm. I'm not sure "determine usage patterns" can happen quickly enough for
> the "analyse" step to actually happen as part of the GSoC. Any thoughts?

I think the uptake by early adopters will be too slow for that. Maybe by then
we will have a good enough idea of what to do anyway, without too much
analysis. Or we could just pick one approach blindly, start implementing it,
and decide later whether it needs to be changed.

> For the first half, ordering as:
>
>    1. will share all downloaded pieces with other interested clients
>       (already what bittornado does!)
>    2. implements variable size pieces for all packages
>       (use hacked up .torrent files that will get you an "interesting" bit
>        of the archive)
>    3. uses current Packages files as torrent files
>       (add Packages -> torrent parsing into bittornado so you don't need to
>        download duplicate information)
>    4. receives input from apt about which packages/pieces to download
>       (add separate scripts to parse /var/lib/dpkg/available and/or apt
>        to prioritise pieces?)
>    5. runs as a daemon, started from /etc/init.d
>       (automate it all)
>
> might work well -- that way it's possible to start releasing usable
> betas right from step (2).

Looks good to me. I'm not sure what you mean by "add separate scripts to parse
/var/lib/dpkg/available and/or apt to prioritise pieces" though.

Cameron