[Pkg-nlp-ja-devel] Bug#719557: mecab: postinst script hardcode packages to update.

Osamu Aoki osamu @ debian.org
Tue Aug 13 03:31:06 UTC 2013


Package: mecab
Version: 0.996-1
Severity: normal

Thanks for uploading the new version.

The current mecab hardcodes the package names on which it runs
dpkg-reconfigure in its postinst.  So it misses the highest-scoring
naist-jdic and other packages that are most likely in use, such as the
UTF-8 versions, while wasting time updating non-default, unused
dictionary packages.

What you really need is to run /usr/lib/mecab/mecab-dict-index with the
proper arguments for the actively used dictionary, i.e. the one chosen
via update-alternatives.  If the user manually changes the dictionary to
some funky customization, the README should tell what to do, too.

(If the binary dictionary format stays the same across version changes,
we should avoid running this time-consuming update code, too.  That is
another issue, to be handled as the next step.  For now, I am assuming
that the dictionary binary API changes with every upstream update.)

So what should we do :-)  Here is my suggestion.

All the mecab-related binary packages are:
 * ipadic
 * ipadic-utf8
 * juman
 * juman-utf8
 * naist-jdic
 * naist-jdic-eucjp
 * open-jtalk (somewhat strange ... needs coordination.)

They basically have their base text dictionary at:
 /usr/share/mecab/dic/<parent_source_package>
They install the generated binary dictionary at:
 /lib/mecab/dic/<binary_package> (common for UTF-8 and EUC)

Under normal circumstances, the binary_package chosen by
update-alternatives is used, and it creates a link (via
/etc/alternatives/mecab-dictionary):
 /lib/mecab/dic/debian
     -> /etc/alternatives/mecab-dictionary
         -> /lib/mecab/dic/<chosen_binary_package>
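To check on a given system which dictionary the chain above ends at, each
hop of the symlink chain can be printed.  The following is only a small
sketch; the function name show_link_chain is made up for illustration,
and it assumes the readlink utility is available.

```shell
# Print every hop of a symlink chain, one path per line.
# (show_link_chain is a hypothetical helper name, not part of mecab.)
show_link_chain() {
    path=$1
    printf '%s\n' "$path"
    while [ -L "$path" ]; do
        target=$(readlink "$path")
        case $target in
            /*) path=$target ;;                      # absolute link target
            *)  path=$(dirname "$path")/$target ;;   # relative link target
        esac
        printf '%s\n' "$path"
    done
}
```

Running "show_link_chain /lib/mecab/dic/debian" should list the three
paths shown above, and "update-alternatives --display mecab-dictionary"
shows the same choice from the alternatives side.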

So each binary dictionary package should install what it does now, plus
one additional file:
 /lib/mecab/dic/<binary_package>/update

This file should contain the arguments to /usr/lib/mecab/mecab-dict-index.
(Coordination should be easy since you and I are among the maintainers
of these dictionary packages.)
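
As an illustration only, the update file for ipadic-utf8 might hold a
single line like the one below.  The option letters (-d, -o, -f, -t) and
the charset names are assumptions about what
/usr/lib/mecab/mecab-dict-index accepts and should be checked against its
--help output; the paths simply follow the layout described above.

```text
-d /usr/share/mecab/dic/ipadic -o /lib/mecab/dic/ipadic-utf8 -f EUC-JP -t UTF-8
```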
 
Then provide a wrapper script /usr/bin/mecab-dict-index.
 * With the single argument "-a", /usr/bin/mecab-dict-index runs
   /usr/lib/mecab/mecab-dict-index with the content of
   /lib/mecab/dic/debian/update (only if such a file is found
   and the version recorded in /lib/mecab/dic/debian/updated
   differs from the current one.  If the .../updated file is not
   found, assume the previous version to be 0.)
   --> This is to avoid running this for every Debian revision which
       has no API change.
 * With exactly 2 arguments "-a debian", /usr/bin/mecab-dict-index runs
   /usr/lib/mecab/mecab-dict-index with the content of
   /lib/mecab/dic/debian/update (only if such a file is found).
   If the second argument is a <chosen_binary_package> other than
   "debian", it should use /lib/mecab/dic/<chosen_binary_package>
   instead of /lib/mecab/dic/debian/ as the path in which to look for
   the update and updated files.
 * With any other arguments, pass them to
   /usr/lib/mecab/mecab-dict-index as "$@".
 * Each time /usr/bin/mecab-dict-index runs successfully, it should
   write the -v output to the file /lib/mecab/dic/debian/updated (which
   should actually be /lib/mecab/dic/<chosen_binary_package>/updated).

(Please note "-a" is not used by /usr/lib/mecab/mecab-dict-index.)
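
The rules above could be sketched in POSIX sh roughly as follows.  This
is an untested outline, not a finished script: the variable and function
names are made up, the logic is wrapped in a function so it can be tried
without installing anything, and it records the version only for the
"-a" forms because the text leaves open where a pass-through run would
record it.

```shell
#!/bin/sh
# Sketch of the proposed /usr/bin/mecab-dict-index wrapper.
# REAL_INDEXER, DIC_ROOT and the function names are illustrative only.
REAL_INDEXER=/usr/lib/mecab/mecab-dict-index
DIC_ROOT=/lib/mecab/dic

# Rebuild the dictionary under $DIC_ROOT/$1.  With "check" as $2, skip
# the rebuild when the recorded version equals the current one.
rebuild_dictionary() {
    dir="$DIC_ROOT/$1"
    [ -f "$dir/update" ] || return 0      # no update file: nothing to do
    current=$("$REAL_INDEXER" -v)
    if [ "${2:-}" = check ]; then
        previous=0                        # no updated file: assume version 0
        [ -f "$dir/updated" ] && previous=$(cat "$dir/updated")
        [ "$previous" = "$current" ] && return 0
    fi
    # The update file holds plain arguments, so word splitting is intended.
    "$REAL_INDEXER" $(cat "$dir/update") || return 1
    printf '%s\n' "$current" > "$dir/updated"
}

mecab_dict_index() {
    if [ "$#" -eq 1 ] && [ "$1" = -a ]; then
        rebuild_dictionary debian check   # "-a": only if the version changed
    elif [ "$#" -eq 2 ] && [ "$1" = -a ]; then
        rebuild_dictionary "$2"           # "-a <name>": always rebuild
    else
        "$REAL_INDEXER" "$@"              # anything else: pass through
    fi
}

# A real script would end with:  mecab_dict_index "$@"
```

Keeping the version check only in the one-argument form matches the
intent above: the mecab postinst can call it on every upgrade cheaply,
while "-a <name>" always forces a rebuild.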

The postinst of mecab should run this /usr/bin/mecab-dict-index with "-a".

The postinst of dictionary packages should run this
/usr/bin/mecab-dict-index with "-a <its_binary_package_name>".
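
As a rough sketch of these two postinst calls, assuming the proposed
wrapper exists: the function names, the WRAPPER variable and the
DICT_NAME variable are made up for illustration, and a real postinst
would put the case statement directly in the script body.

```shell
# Hypothetical postinst bodies for the scheme proposed above.
WRAPPER=/usr/bin/mecab-dict-index   # the proposed wrapper (does not exist yet)
DICT_NAME=ipadic-utf8               # example binary dictionary name

# Body of the mecab postinst: rebuild the active dictionary if needed.
mecab_postinst() {
    case "$1" in
        configure) "$WRAPPER" -a ;;
    esac
}

# Body of a dictionary package's postinst: rebuild its own dictionary.
dictionary_postinst() {
    case "$1" in
        configure) "$WRAPPER" -a "$DICT_NAME" ;;
    esac
}
```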

A user who changes the update-alternatives selection should run
mecab-dict-index with "-a debian".

The package installation order and the timing of the update-alternatives
runs may cause some race conditions.  But I think the above should be
robust enough while running only for the dictionary in use.  If another
dictionary needs to be updated after changing update-alternatives, it is
easy to update it via the command with "-a" etc.

(This design change should be safe even if some external/local packages
are installed, or if a local dictionary is installed by a user running
mecab-dict-index as root.  If a user creates a user dictionary in his
home directory, he needs to re-run mecab-dict-index, as expected.  This
change is compatible, and he can drop specifying the full path for
mecab-dict-index.)

Side note:
I do not see that the doxygen files in doc/doxygen and the English
translation in doc/en are installed in the mecab binary package.

The package lacks the debian/source/format file containing "3.0 (quilt)".

If you add it, you do not need to explain the patch system in
README.source (that file should be removed if you properly use
"3.0 (quilt)").  Then you can drop the quilt dependency and drop
patchsys-quilt.mk from debian/rules.  (I am not so familiar with CDBS,
though.)  If possible, please use the new standard debhelper command
'dh "$@"' with overrides instead, to reduce build dependencies.
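
Declaring the "3.0 (quilt)" format amounts to creating one small file in
the source tree.  A minimal sketch follows; the function name is made up
for illustration and takes the root of the unpacked source tree as its
argument.

```shell
# Declare the "3.0 (quilt)" source format in the source tree rooted at $1.
# (set_quilt_format is a hypothetical helper name.)
set_quilt_format() {
    mkdir -p "$1/debian/source" &&
    printf '%s\n' '3.0 (quilt)' > "$1/debian/source/format"
}
```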

If this mecab source package becomes a multi-source package with
python-mecab and the other script binding tarballs, you can run the
latest swig in the swig directory and copy the generated sources for use
in building the script binding packages.  This approach has pros and
cons, but it allows us to build the python3 binding.  Currently,
python-mecab ships a source file stating:

>> Do not make changes to this file unless you know what you are doing
>> -- modify the SWIG interface file instead.
(The SWIG interface file is not in the python-mecab source package, so
this is almost a sourceless package.)
  See http://bugs.debian.org/cgi-bin/719466

-- System Information:
Debian Release: jessie/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing'), (500, 'stable'), (100, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.10-2-rt-amd64 (SMP w/8 CPU cores; PREEMPT)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages mecab depends on:
ii  libc6           2.17-92
ii  libgcc1         1:4.8.1-8
ii  libmecab2       0.996-1
ii  libstdc++6      4.8.1-8
ii  mecab-ipadic    2.7.0-20070801+main-1
ii  mecab-jumandic  5.1+20070304-3

mecab recommends no packages.

mecab suggests no packages.

-- no debconf information


