[Pkg-nlp-ja-devel] Bug#881231: chasen-dictutils: writes uninitialized memory to .dat files

Bernhard M. Wiedemann debianbugs @ zq1.de
Thu Nov 9 04:49:05 UTC 2017


Package: chasen-dictutils
Severity: wishlist
Tags: patch
User: reproducible-builds @ lists.alioth.debian.org
Usertags: toolchain randomness ASLR padding

Dear Maintainer,

While working on the “reproducible builds” effort [1] for openSUSE,
we noticed that ipadic could not be built reproducibly [2],
and the same is the case for Debian [3].

The attached patch initializes memory written to .dat files.
Once applied, ipadic can be built reproducibly in our current
experimental framework.
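
To illustrate the idea, here is a minimal sketch (not the chasen code itself).
struct record is a hypothetical stand-in for da_dat_t, and fill_record() and
records_identical() are names made up for this example. It assumes a typical
ABI where the compiler inserts padding after the char member. Zeroing the whole
struct before setting its fields makes two records with equal fields
byte-identical, no matter what was on the stack before. Strictly speaking the
C standard leaves padding unspecified after a member store, but in practice
GCC and Clang leave the zeroed padding untouched.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record; NOT the real da_dat_t layout from chasen. */
struct record {
    char tag;     /* 1 byte, usually followed by padding bytes */
    long offset;  /* aligned member that forces the padding */
};

/* Zero the whole object, padding included, then set the fields. */
static void fill_record(struct record *r, char tag, long offset)
{
    assert(r != NULL);
    memset(r, 0, sizeof(*r));
    r->tag = tag;
    r->offset = offset;
}

/* Returns 1 if two records with equal fields are byte-for-byte equal,
 * even though they started out holding different garbage. */
int records_identical(void)
{
    struct record a, b;

    /* Simulate differing leftover stack contents between two builds. */
    memset(&a, 0xAA, sizeof(a));
    memset(&b, 0x55, sizeof(b));

    fill_record(&a, 'x', 42L);
    fill_record(&b, 'x', 42L);

    return memcmp(&a, &b, sizeof(a)) == 0;
}
```

Without the memset() in fill_record(), the padding bytes would keep the 0xAA
and 0x55 garbage, and the same comparison would be expected to fail on ABIs
that pad this struct. That mirrors what fwrite() was putting into chadic.dat.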

 [1]: https://wiki.debian.org/ReproducibleBuilds
 [2]: https://bugzilla.opensuse.org/show_bug.cgi?id=1067269
 [3]: https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/ipadic.html
-------------- next part --------------
Author: Bernhard M. Wiedemann <bwiedemann suse de>
Date: 2017-11-08

Problem: the ipadic package differed on every build
because its chadic.dat contained uninitialized memory
from the padding bytes of the da_dat_t structure

Solution: initialize memory (including padding added by compilers)
before use

Index: chasen-2.4.4/mkchadic/dumpdic.c
===================================================================
--- chasen-2.4.4.orig/mkchadic/dumpdic.c
+++ chasen-2.4.4/mkchadic/dumpdic.c
@@ -45,6 +45,7 @@ dump_dat(lexicon_t *lex, FILE *datfile,
     long index;
     da_dat_t dat;
 
+    memset(&dat, 0, sizeof(dat));
     index = ftell(datfile);
     dat.stem_len = lex->stem_len;
     dat.reading_len = lex->reading_len;
@@ -137,6 +138,7 @@ dump_dic(lexicon_t *entries, FILE *outpu
     da_lex_t lex;
     long compound = NO_COMPOUND;
 
+    memset(&lex, 0, sizeof(lex));
     if (entries[1].pos)
 	compound = dump_compound(entries, lexfile, datfile);
 

