[Reproducible-commits] [debbindiff] 11/19: Stop feeding input to diff after a certain amount of lines

Jérémy Bobbio lunar at moszumanska.debian.org
Tue Mar 31 14:59:29 UTC 2015


This is an automated email from the git hooks/post-receive script.

lunar pushed a commit to branch pu/feed-diff
in repository debbindiff.

commit 1c5212b9eb2c87d3bc5d4fbc7f7a64abd20d7a79
Author: Jérémy Bobbio <lunar at debian.org>
Date:   Mon Mar 30 16:29:06 2015 +0200

    Stop feeding input to diff after a certain amount of lines
    
    GNU diff is unable to cope with very large files as its memory usage
    grows too much. So we now have a maximum input size to avoid getting OOM.
---
 debbindiff/difference.py | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/debbindiff/difference.py b/debbindiff/difference.py
index dd1c3b0..e42d955 100644
--- a/debbindiff/difference.py
+++ b/debbindiff/difference.py
@@ -33,6 +33,7 @@ from debbindiff import logger, tool_required, RequiredToolNotFound
 
 MAX_DIFF_BLOCK_LINES = 50
 MAX_DIFF_LINES = 10000
+MAX_DIFF_INPUT_LINES = 100000 # GNU diff cannot process arbitrary large files :(
 
 
 class DiffParser(object):
@@ -231,9 +232,15 @@ def make_feeder_from_unicode(content):
 
 def make_feeder_from_file(in_file, filter=lambda buf: buf.encode('utf-8')):
     def feeder(out_file):
+        line_count = 0
         end_nl = False
         for buf in iter(in_file.readline, b''):
+            line_count += 1
             out_file.write(filter(buf))
+            if line_count >= MAX_DIFF_INPUT_LINES:
+                out_file.write('[ Too much input for diff ]%s\n' % (' ' * out_file.fileno()))
+                end_nl = True
+                break
             end_nl = buf[-1] == '\n'
         return end_nl
     return feeder
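
For readers who want to try the capping behaviour outside debbindiff, the following is a minimal standalone sketch of the same idea, not the project's code. It assumes Python 3 and binary file objects; the name make_capped_feeder and the max_lines parameter are hypothetical. It also drops the filter argument and the file-descriptor-dependent padding that the patch appends to the marker line.

```python
import io

MAX_DIFF_INPUT_LINES = 100000  # same cap as in the patch above


def make_capped_feeder(in_file, max_lines=MAX_DIFF_INPUT_LINES):
    """Return a feeder that copies at most max_lines lines from in_file.

    Hypothetical helper for illustration; in_file is assumed to be a
    binary file object.
    """
    def feeder(out_file):
        line_count = 0
        end_nl = False
        for buf in iter(in_file.readline, b''):
            line_count += 1
            out_file.write(buf)
            if line_count >= max_lines:
                # Stop feeding and leave a marker where the input was cut.
                out_file.write(b'[ Too much input for diff ]\n')
                end_nl = True
                break
            # Track whether the last line written ended with a newline.
            end_nl = buf.endswith(b'\n')
        return end_nl
    return feeder


if __name__ == '__main__':
    source = io.BytesIO(b'one\ntwo\nthree\nfour\n')
    sink = io.BytesIO()
    make_capped_feeder(source, max_lines=2)(sink)
    print(sink.getvalue().decode('utf-8'), end='')
```

As in the patch, the marker is written as soon as the cap is reached, even if the input happens to end exactly at that line.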

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/reproducible/debbindiff.git


