[Pkg-nlp-ja-devel] Bug#518178: kakasi: Doesn't know ティ (ti) katakana combination

Osamu Aoki rx788473 @ tc4.so-net.ne.jp
Thu Feb 9 15:26:02 UTC 2017


Hi,

Maybe the request should be more like:

Please include an explicit kana <-> romanization conversion table as
part of this package and implement features accordingly.

Currently, kakasi supports two romanization schemes.

 * Kunrei             --- Education ministry supported style
                          Taught in elementary school but rarely used
                          in public.  Some prefer this as a way to input
                          Japanese since it is more concise.
                          "ち" = "ti".
 * Hepburn (or Hebon) --- Foreign ministry supported style
                          Used on public road signs and for names in
                          passports.  It requires extra keys to input
                          some phonemes.
                          "ち" = "chi".
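
To illustrate what an explicit table could look like, here is a minimal
Python sketch.  The dictionary and the function name are made up for
this example and cover only four kana; this is not kakasi's actual data
or code.

```python
# Hypothetical excerpt of an explicit kana -> romaji table.
# Only a few kana that differ between the two schemes are listed.
KANA_TO_ROMAJI = {
    "kunrei":  {"し": "si",  "ち": "ti",  "つ": "tu",  "ふ": "hu"},
    "hepburn": {"し": "shi", "ち": "chi", "つ": "tsu", "ふ": "fu"},
}


def romanize(text, scheme):
    """Romanize text one kana at a time using the chosen scheme.

    Characters missing from the table are passed through unchanged.
    """
    table = KANA_TO_ROMAJI[scheme]
    return "".join(table.get(ch, ch) for ch in text)


print(romanize("ち", "kunrei"))   # ti
print(romanize("ち", "hepburn"))  # chi
```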

As for "ティ":

 * Kunrei             --- No official way
                          "ti" is already taken by "チ" or "ち"
 * Hepburn (or Hebon) --- "ti" does not conflict but is not part of
                          the original Hepburn system

Then you may wonder how Japanese people type Japanese text via romaji
input.  Typing "ti" yields "チ" or "ち".

There are de facto (Ichitaro/VJE/MS IME/...) ways to input it.
"thi", "teli", and "texi" usually work ("x" and "l" force the next
character to be a small one).  I usually use "thi" since it is the
shortest one.
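
The input side can be sketched as a longest-match lookup.  This is only
an illustration in Python: the table is a tiny made-up excerpt, the
function name is hypothetical, and real input methods (anthy, MS IME,
...) may work differently internally.

```python
# Hypothetical excerpt of a romaji -> kana input table.
ROMAJI_TO_KANA = {
    "ti": "ち", "chi": "ち",
    "te": "て",
    "thi": "てぃ",            # de facto shortcut
    "li": "ぃ", "xi": "ぃ",   # "l"/"x" make the next kana small
}

# Try longer keys first so that "thi" is not split up.
_KEYS = sorted(ROMAJI_TO_KANA, key=len, reverse=True)


def to_kana(romaji):
    """Convert romaji to hiragana by greedy longest-match lookup."""
    out = []
    i = 0
    while i < len(romaji):
        for key in _KEYS:
            if romaji.startswith(key, i):
                out.append(ROMAJI_TO_KANA[key])
                i += len(key)
                break
        else:
            # No table entry matches here: keep the character as typed.
            out.append(romaji[i])
            i += 1
    return "".join(out)


for typed in ("ti", "thi", "teli", "texi"):
    print(typed, to_kana(typed))
```

With this table, "ti" gives ち while "thi", "teli", and "texi" all give
てぃ.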

I tested this with ibus/anthy: てぃ てぃ てぃ.  It works.

Yes, the lack of "ティ" support lies in the romanization schemes
themselves.

This sound only appears in borrowed foreign words.  In romanized text, we
can use the original word.  So I spell "Yokohama City" ;-)

Anyway, supporting these oddball newer kana spellings may be an
interesting extension: ぁぃぅぇぉヵヶゎ

I think adding -l and -x options to enable "teli"- and "texi"-like
output is one idea.  But this behavior needs to be documented.
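
A rough Python sketch of that idea follows.  Note that -l and -x are
only the options proposed here, not existing kakasi options; the
function, its small_prefix parameter, and both tables are hypothetical
and cover only the kana needed for the example.

```python
# Hypothetical Kunrei-style excerpt for full-size katakana.
BASE = {"テ": "te", "チ": "ti", "イ": "i"}
# Small kana mapped to the romaji of their full-size counterparts.
SMALL = {"ィ": "i"}


def romanize_small(text, small_prefix="x"):
    """Romanize katakana, marking small kana with a prefix.

    small_prefix="l" mimics the proposed -l behaviour ("teli"),
    small_prefix="x" mimics the proposed -x behaviour ("texi").
    """
    out = []
    for ch in text:
        if ch in SMALL:
            out.append(small_prefix + SMALL[ch])
        else:
            # Unknown characters are passed through unchanged.
            out.append(BASE.get(ch, ch))
    return "".join(out)


print(romanize_small("ティ", "l"))  # teli
print(romanize_small("ティ", "x"))  # texi
print(romanize_small("チ"))         # ti
```

The point of such output is that "ティ" and "チ" stay distinguishable
and the result can be typed back into a romaji input method.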

Osamu


