[Reproducible-commits] [misc] 03/04: clean-notes: raise exception when a duplicate node is encountered

Johannes Schauer josch at moszumanska.debian.org
Thu Feb 12 07:18:19 UTC 2015


This is an automated email from the git hooks/post-receive script.

josch pushed a commit to branch master
in repository misc.

commit 61c232b11ea119f4ac574b126d2674de14ba5864
Author: josch <j.schauer at email.de>
Date:   Thu Feb 12 08:11:05 2015 +0100

    clean-notes: raise exception when a duplicate node is encountered
    
     pyyaml does not check for duplicate keys when reading yaml.
    
     This is bad because one either has to check for duplicates manually
     before entering info about a new package, or some info gets lost when
     running this script, because only one of the duplicate items remains
     after parsing.
    
     So instead, let's throw an error if a duplicate key is found. Using the
     line and column number in the error output, it is easy to find the
     offending keys and manually merge their content.
---
 clean-notes | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)
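
To illustrate the problem described in the commit message, here is a minimal
sketch, assuming PyYAML is installed. The package names and fields are made up
for illustration and are not taken from the real notes file.

```python
import yaml

# Hypothetical notes snippet: "pkg-a" appears twice at the top level.
text = """\
pkg-a:
  version: 1.0-1
pkg-b:
  version: 2.0-1
pkg-a:
  comments: second entry
"""

# An unpatched loader accepts this without complaint. The later entry
# replaces the earlier one, so the "version" field of pkg-a is lost.
data = yaml.safe_load(text)
print(sorted(data))
print(data["pkg-a"])
```

No warning or exception is produced, which is why the duplicate is easy to
miss when the file is rewritten by a script.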

diff --git a/clean-notes b/clean-notes
index d2d605b..a9c7382 100755
--- a/clean-notes
+++ b/clean-notes
@@ -20,6 +20,59 @@ import argparse
 import psycopg2
 import requests
 
+# pyyaml does not check for duplicate keys when reading yaml.
+#
+# This is bad because one either has to check for duplicates manually before
+# entering info about a new package, or some info gets lost when running this
+# script, because only one of the duplicate items remains after
+# parsing.
+#
+# So instead, let's throw an error if a duplicate key is found. Using the line
+# and column number in the error output, it is easy to find the offending keys
+# and manually merge their content.
+#
+# It seems the only way to implement this is to monkey-patch the pyyaml loader.
+# To allow ScalarNode objects to be stored in a set, their __eq__ and __hash__
+# methods have to be patched.
+#
+# To check for duplicates while parsing, the function compose_mapping_node
+# below was taken from the pyyaml sources (Copyright © 2006 Kirill Simonov
+# <xi at resolvent.net> under an Expat/MIT license) and modified to keep a set
+# of already seen key nodes.
+def scalar_node_eq(self, other):
+    return self.id == other.id and self.tag == other.tag and self.value == other.value
+yaml.nodes.ScalarNode.__eq__ = scalar_node_eq
+
+def scalar_node_hash(self):
+    return hash((self.id, self.tag, self.value))
+yaml.nodes.ScalarNode.__hash__ = scalar_node_hash
+
+def compose_mapping_node(self, anchor):
+    start_event = self.get_event()
+    tag = start_event.tag
+    if tag is None or tag == u'!':
+        tag = self.resolve(yaml.nodes.MappingNode, None, start_event.implicit)
+    node = yaml.nodes.MappingNode(tag, [],
+            start_event.start_mark, None,
+            flow_style=start_event.flow_style)
+    if anchor is not None:
+        self.anchors[anchor] = node
+    seen = set()
+    while not self.check_event(yaml.events.MappingEndEvent):
+        key_event = self.peek_event()
+        item_key = self.compose_node(node, None)
+        if item_key in seen:
+            raise yaml.composer.ComposerError("while composing a mapping", start_event.start_mark,
+                    "found duplicate key", key_event.start_mark)
+        seen.add(item_key)
+        item_value = self.compose_node(node, item_key)
+        node.value.append((item_key, item_value))
+    end_event = self.get_event()
+    node.end_mark = end_event.end_mark
+    return node
+yaml.composer.Composer.compose_mapping_node = compose_mapping_node
+
+
 apt_pkg.init_system()
 
 reproducible_json = 'https://reproducible.debian.net/reproducible.json'
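
The duplicate check in the patch above can also be sketched without
monkey-patching, as a read-only pass over the node graph returned by
yaml.compose(). This is not what the commit does; it is an alternative shown
only to make the "set of already seen keys" idea concrete. The helper name
find_duplicate_keys and the sample data are hypothetical. Only scalar keys are
compared, using their tag and value.

```python
import yaml


def find_duplicate_keys(node, found=None, visited=None):
    """Collect (key, line, column) for every repeated scalar mapping key."""
    if found is None:
        found = []
    if visited is None:
        visited = set()
    # Aliases can make the node graph cyclic, so visit each node only once.
    if id(node) in visited:
        return found
    visited.add(id(node))

    if isinstance(node, yaml.nodes.MappingNode):
        seen = set()
        for key_node, value_node in node.value:
            if isinstance(key_node, yaml.nodes.ScalarNode):
                ident = (key_node.tag, key_node.value)
                if ident in seen:
                    mark = key_node.start_mark
                    # Marks are zero-based; report one-based positions.
                    found.append(
                        (key_node.value, mark.line + 1, mark.column + 1))
                seen.add(ident)
            find_duplicate_keys(value_node, found, visited)
    elif isinstance(node, yaml.nodes.SequenceNode):
        for child in node.value:
            find_duplicate_keys(child, found, visited)
    return found


# Made-up input: "pkg-a" is repeated on the fifth line.
text = "pkg-a:\n  version: 1\npkg-b:\n  version: 2\npkg-a:\n  comments: dup\n"
duplicates = find_duplicate_keys(yaml.compose(text, Loader=yaml.SafeLoader))
print(duplicates)
```

Unlike the patched composer, this sketch reports all duplicates instead of
stopping at the first one, at the cost of a separate pass over the document.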

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/reproducible/misc.git


