UTF-8 and ispell
G. Milde
milde at users.sourceforge.net
Fri Sep 21 16:11:49 UTC 2007
On 21.09.07, Rafael Laboissiere wrote:
> * G. Milde <milde at users.sourceforge.net> [2007-09-20 11:31]:
> Actually, my mental model of how the whole thing works was wrong. The
> jed-ispell-dicts.sl is automatically generated by dictionaries-common at
> installation time for package i<language> from the information provided in
> file debian/i<language>.info-ispell (also in
> /var/lib/dictionaries-common/ispell/i<language>).
> If a new record is created in this file containing, as you suggested:
> Language: deutsch (New German 8 bit UTF-8)
> Hash-Name: ngerman
> Emacsen-Name: german-new8-utf8
> Casechars: [A-Za-zÄÖÜäößü]
> Not-Casechars: [^A-Za-zÄÖÜäößü]
> Otherchars: [']
> Many-Otherchars: no
> Additionalchars: ÄÖÜäößü
> Ispell-Args: -C -d ngerman
> Extended-Character-Mode: ~utf8
> Coding-System: utf-8
> Locale: de_DE
> then the following would appear in jed-ispell-dicts.sl:
> ispell_add_dictionary (
> "german-new8-utf8",
> "ngerman",
> "ÄÖÜäößü",
> "[']",
> "~utf8",
> "-C -d ngerman");
> So, my conclusion is that it is neither jed-extra's nor
> dictionaries-common's responsibility to provide utf-8 support for
> ispell.sl but rather it is up to the individual i<language> package to
> provide it through the debian/i<language>.info-ispell files. (I will
> consider filing bug reports against the ispell dictionary packages.)
Yes, indeed, this should be solved at the ispell dictionary package level.
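For reference, the record-to-call mapping quoted above can be sketched in a few lines of Python. This is only an illustration, not the dictionaries-common generator: the field order is inferred from the single example in this thread, the helper names parse_record and to_slang_call are made up, and no escaping of quotes or backslashes is attempted.

```python
# Field order inferred from the one example in this thread (an assumption,
# not taken from the dictionaries-common sources).
FIELDS = ["Emacsen-Name", "Hash-Name", "Additionalchars",
          "Otherchars", "Extended-Character-Mode", "Ispell-Args"]

RECORD = """\
Language: deutsch (New German 8 bit UTF-8)
Hash-Name: ngerman
Emacsen-Name: german-new8-utf8
Casechars: [A-Za-zÄÖÜäößü]
Not-Casechars: [^A-Za-zÄÖÜäößü]
Otherchars: [']
Many-Otherchars: no
Additionalchars: ÄÖÜäößü
Ispell-Args: -C -d ngerman
Extended-Character-Mode: ~utf8
Coding-System: utf-8
Locale: de_DE
"""

def parse_record(text):
    """Parse 'Key: value' lines into a dict (hypothetical helper)."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            record[key.strip()] = value.strip()
    return record

def to_slang_call(record):
    """Format the selected fields as an ispell_add_dictionary() call."""
    args = ",\n".join('  "%s"' % record.get(field, "") for field in FIELDS)
    return "ispell_add_dictionary (\n%s);" % args

print(to_slang_call(parse_record(RECORD)))
```

Fields that the call does not use (Casechars, Coding-System, Locale, ...) are simply ignored here.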
> The only downside of this approach is that users will be provided with both
> choices "<language>" and "<language>-utf8" when calling
> ispell_change_dictionary although only one of them will make ispell.sl work
> correctly according to the character encoding system used.
> It would be good if non-UTF8 possibilities could be filtered out when
> _slang_utf8_ok is set, probably by looking at the extchr argument passed
> to ispell_add_dictionary(). [Paul: what do you think?]
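The filtering idea can be sketched roughly as follows, in Python rather than S-Lang and only as an illustration. The function name visible_dictionaries, the dictionary names and the "~latin1" value for the non-UTF-8 entry are assumptions; only "~utf8" appears in this thread.

```python
def visible_dictionaries(dicts, utf8_ok):
    """Return the dictionary names to offer to the user.

    dicts maps a dictionary name to its extchr string. In UTF-8 mode only
    the "~utf8" entries are kept; otherwise nothing is filtered.
    """
    if not utf8_ok:
        return sorted(dicts)
    return sorted(name for name, extchr in dicts.items() if extchr == "~utf8")

# Hypothetical registry: name -> extchr
DICTS = {
    "german-new8": "~latin1",       # assumed value for the 8-bit variant
    "german-new8-utf8": "~utf8",
}

print(visible_dictionaries(DICTS, utf8_ok=True))
print(visible_dictionaries(DICTS, utf8_ok=False))
```

Filtering only in UTF-8 mode leaves the "-utf8" entries visible in a non-UTF-8 Jed.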
A simple method would be to put the ispell_add_dictionary() call in a try
clause. An invalid string argument would then result in the "guilty"
dictionary being skipped.
You would still have <language>-utf8 with non-utf8 Jed, but this should
still work and may even be useful in some cases.
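The try-clause idea, sketched as a Python analogue (not S-Lang, and not tested against ispell.sl): each registration is attempted on its own, and an entry whose string is invalid in the current encoding is skipped. add_dictionary is a hypothetical stand-in for ispell_add_dictionary() that simulates a UTF-8 Jed by decoding the letters as UTF-8; the 8-bit entry and its "~latin1" value are assumptions.

```python
registered = {}

def add_dictionary(name, hash_name, letters, otherchars, extchr, args):
    """Hypothetical stand-in for ispell_add_dictionary() in a UTF-8 Jed."""
    # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    letters.decode("utf-8")
    registered[name] = (hash_name, extchr, args)

def register_all(entries):
    """Register each entry; return the names of the skipped ones."""
    skipped = []
    for entry in entries:
        try:
            add_dictionary(*entry)
        except UnicodeDecodeError:
            skipped.append(entry[0])
    return skipped

ENTRIES = [
    # 8-bit variant: the letters are Latin-1 bytes, invalid as UTF-8
    ("german-new8", "ngerman", "ÄÖÜäößü".encode("latin-1"),
     "[']", "~latin1", "-C -d ngerman"),
    ("german-new8-utf8", "ngerman", "ÄÖÜäößü".encode("utf-8"),
     "[']", "~utf8", "-C -d ngerman"),
]

skipped = register_all(ENTRIES)
print("skipped:", skipped)
print("registered:", sorted(registered))
```

Whether an invalid string literal in the generated file really raises a catchable error at that point, rather than failing when the file is parsed, would have to be checked in Jed itself.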
Günter