[DRE-commits] [bsfilter] 01/04: End support for kakasi, which is ruby1.8-only

Christian Hofstaedtler zeha at moszumanska.debian.org
Fri Feb 14 14:58:32 UTC 2014


This is an automated email from the git hooks/post-receive script.

zeha pushed a commit to branch master
in repository bsfilter.

commit f1d82a7d688e58696a6bc4e93e33378eb6da7f98
Author: Christian Hofstaedtler <zeha at debian.org>
Date:   Fri Feb 14 15:54:48 2014 +0100

    End support for kakasi, which is ruby1.8-only
---
 debian/control                    |  4 +--
 debian/patches/010_disable_chasen | 52 +++++++++++++++++++++++++++++----------
 2 files changed, 41 insertions(+), 15 deletions(-)

diff --git a/debian/control b/debian/control
index 81f7596..30c42f1 100644
--- a/debian/control
+++ b/debian/control
@@ -3,7 +3,7 @@ Section: mail
 Priority: optional
 Maintainer: Debian Ruby Extras Maintainers <pkg-ruby-extras-maintainers at lists.alioth.debian.org>
 Uploaders: akira yamada <akira at debian.org>, Taku YASUI <tach at debian.org>
-Build-Depends: debhelper (>= 8), docbook-to-man, ruby | ruby-interpreter, ruby-mecab, ruby-kakasi, ruby-qdbm
+Build-Depends: debhelper (>= 8), docbook-to-man, ruby | ruby-interpreter, ruby-mecab, ruby-qdbm
 Standards-Version: 3.9.3
 Homepage: http://sourceforge.jp/projects/bsfilter/
 Vcs-Git: git://anonscm.debian.org/pkg-ruby-extras/bsfilter.git
@@ -13,7 +13,7 @@ XS-Ruby-Versions: 1.9.1 2.0
 Package: bsfilter
 Architecture: all
 Depends: ruby | ruby-interpreter, ${misc:Depends}
-Suggests: ruby-mecab | ruby-kakasi, ruby-qdbm
+Suggests: ruby-mecab, ruby-qdbm
 XB-Ruby-Version: 1.9.1 2.0
 Description: Bayesian spam filter
  Bsfilter is a spam filter which can distinguish spam mail from other mails.
diff --git a/debian/patches/010_disable_chasen b/debian/patches/010_disable_chasen
index 342ec6c..9e4d871 100644
--- a/debian/patches/010_disable_chasen
+++ b/debian/patches/010_disable_chasen
@@ -1,18 +1,42 @@
 Index: bsfilter/bsfilter/bsfilter
 ===================================================================
---- bsfilter.orig/bsfilter/bsfilter	2013-12-22 18:34:24.271015280 +0100
-+++ bsfilter/bsfilter/bsfilter	2013-12-22 18:34:34.000000000 +0100
-@@ -1068,9 +1068,6 @@ EOM
+--- bsfilter.orig/bsfilter/bsfilter	2014-02-14 15:53:15.331487185 +0100
++++ bsfilter/bsfilter/bsfilter	2014-02-14 15:53:55.626996360 +0100
+@@ -1068,11 +1068,6 @@ EOM
          else
            @m_dic_enc = Encoding::default_external
          end
 -      when "chasen"
 -        Chasen.getopt("-F", '%H %m\n', "-j")
 -        @method = Proc::new {|s| chasen(s)}
-       when "kakasi"
-         @method = Proc::new {|s| kakasi(s)}
+-      when "kakasi"
+-        @method = Proc::new {|s| kakasi(s)}
        else
-@@ -1152,31 +1149,6 @@ EOM
+         raise "internal error: unknown method #{method}"
+       end
+@@ -1095,21 +1090,6 @@ EOM
+     Reg_not_kanji_katakana = Regexp::compile("[^\xb0\xa1-\xf4\xa4\xa1\xbc\xa5\xa1-\xa5\xf6]".force_encoding('EUC-JP'))
+ #     Reg_not_kanji_katakana = Regexp::compile("[^\xb0\xa1-\xf4\xa4\xa1\xbc\xa5\xa1-\xa5\xf6]".force_encoding('ASCII-8BIT'))
+     
+-    def kakasi(str)
+-      str = str.gsub(/[\x00-\x7f]/, ' ')
+-      if (str =~ /\A +\z/)
+-        return []
+-      end
+-      array = Array::new
+-      Kakasi::kakasi("-oeuc -w", str).scan(/\S+/).each do |token|
+-        token.gsub!(Reg_not_kanji_katakana, '')
+-        if ((token =~ Reg_kanji) || (token.length > 2))
+-          array.push(token)
+-        end
+-      end
+-      return array
+-    end
+-    
+     def mecab(str)
+       str = str.encode(@m_dic_enc, :invalid => :replace, :undef => :replace, :replace => ' ')
+       str = str.gsub(/[\x00-\x7f]/, ' ')
+@@ -1152,31 +1132,6 @@ EOM
        return array
      end
      
@@ -44,28 +68,30 @@ Index: bsfilter/bsfilter/bsfilter
      def block(str)
        tokens = str.scan(Reg_kanji)
        tokens.concat(str.scan(Reg_katakana))
-@@ -2013,7 +1985,7 @@ OPTIONS
+@@ -2013,7 +1968,7 @@ OPTIONS
  		specify the name of database type
  		"sdbm" by default
  
 -        --jtokenizer|-j bigram|block|mecab|chasen|kakasi
-+        --jtokenizer|-j bigram|block|mecab|kakasi
++        --jtokenizer|-j bigram|block|mecab
  		specify algorithm of a tokenizer for Japanese language
  		"bigram" by default
  
-@@ -3199,8 +3171,6 @@ EOM
+@@ -3199,10 +3154,6 @@ EOM
      when "block"
      when "mecab"
        require 'MeCab'
 -    when "chasen"
 -      require 'chasen.o'
-     when "kakasi"
-       require 'kakasi'
+-    when "kakasi"
+-      require 'kakasi'
      else
+       soft_raise(sprintf("#{$0}: unsupported argument `%s' for --jtokenizer or -j\n", options["jtokenizer"]))
+     end
 Index: bsfilter/test/test.rb
 ===================================================================
---- bsfilter.orig/test/test.rb	2013-12-22 18:34:24.271015280 +0100
-+++ bsfilter/test/test.rb	2013-12-22 18:34:34.000000000 +0100
+--- bsfilter.orig/test/test.rb	2014-02-14 15:53:15.331487185 +0100
++++ bsfilter/test/test.rb	2014-02-14 15:53:19.000000000 +0100
 @@ -228,14 +228,9 @@ class TestMultipleInstances < Test::Unit
      @bsfilter2.setup($default_options + ["--jtokenizer", "bigram"])
      @bsfilter2.use_dummyfh

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/pkg-ruby-extras/bsfilter.git


