[Pkg-bazaar-commits] ./bzr/unstable r224: doc

mbp at sourcefrog.net
Fri Apr 10 07:43:59 UTC 2009


------------------------------------------------------------
revno: 224
committer: mbp at sourcefrog.net
timestamp: Sat 2005-04-09 16:00:35 +1000
message:
  doc
modified:
  bzrlib/revfile.py
-------------- next part --------------
=== modified file 'bzrlib/revfile.py'
--- a/bzrlib/revfile.py	2005-04-09 05:51:22 +0000
+++ b/bzrlib/revfile.py	2005-04-09 06:00:35 +0000
@@ -61,6 +61,12 @@
 balanced tree indexed by SHA1 so we can much more efficiently find the
 index associated with a particular hash.  For 100,000 revs we would be
 able to find it in about 17 random reads, which is not too bad.
+
+This performs pretty well except when trying to calculate deltas of
+really large files.  For that the main thing would be to plug in
+something faster than difflib, which is after all pure Python.
+Another approach is to just store the gzipped full text of big files,
+though perhaps that's too perverse?
 """
  
 
@@ -73,6 +79,12 @@
 # TODO: Some kind of faster lookup of SHAs?  The bad thing is that probably means
 # rewriting existing records, which is not so nice.
 
+# TODO: Something to check that regions identified in the index file
+# completely butt up and do not overlap.  Strictly it's not a problem
+# if there are gaps and that can happen if we're interrupted while
+# writing to the datafile.  Overlapping would be very bad though.
+
+
 
 import sys, zlib, struct, mdiff, stat, os, sha
 from binascii import hexlify, unhexlify
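
The new TODO asks for a check that the regions named in the index file butt up against each other and never overlap. The following is a minimal sketch of such a check, not the revfile implementation: it assumes the index has already been parsed into a list of (offset, length) pairs describing byte ranges in the datafile, and the name check_regions and its start parameter are hypothetical. Gaps are reported separately from overlaps, since the TODO treats gaps as tolerable (an interrupted write) and overlaps as serious.

```python
def check_regions(regions, start=0):
    """Report gaps and overlaps among (offset, length) byte regions.

    `regions` is an assumed, already-parsed view of the index: one
    (offset, length) pair per record.  `start` is the offset at which
    the first region is expected to begin.

    Returns (gaps, overlaps), each a list of half-open (begin, end)
    byte ranges.
    """
    gaps = []
    overlaps = []
    end = start  # furthest byte covered by the regions seen so far
    for offset, length in sorted(regions):
        if offset > end:
            # Nothing covers [end, offset): harmless, e.g. after an
            # interrupted write to the datafile.
            gaps.append((end, offset))
        elif offset < end:
            # This region begins inside data that is already claimed.
            overlap_end = min(end, offset + length)
            if overlap_end > offset:
                overlaps.append((offset, overlap_end))
        end = max(end, offset + length)
    return gaps, overlaps


if __name__ == "__main__":
    # Two adjacent regions, then a gap, then a region that overlaps
    # its predecessor.
    example = [(0, 10), (10, 5), (20, 5), (22, 4)]
    gaps, overlaps = check_regions(example)
    print("gaps:", gaps)
    print("overlaps:", overlaps)
```

Sorting by offset keeps the check to a single pass, and tracking the furthest end seen so far means a region nested entirely inside an earlier one is still reported as an overlap.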


