[Dict-common-dev] UTF-8 and ispell

Paul Boekholt p.boekholt at gmail.com
Sat Sep 29 10:43:02 UTC 2007


2007/9/29, Rafael Laboissiere <rafael at debian.org>:
> Yes, Perl understands "\xxx" escape sequences in strings where "xxx" is an
> octal number [1].  However, this does not help us here because when parsing
> the info-aspell file, DictionariesCommon.pm sees the string as ASCII, i.e.
> containing the "\" and [0-7] characters.

So Perl's \xxx sequences are basically the same as in S-Lang.
>
> [1] http://perldoc.perl.org/perlreref.html#ESCAPE-SEQUENCES
This is for regular expressions, not normal strings, but I think the escape
sequences are mostly the same.
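
To make the point about the info-aspell file concrete, here is a minimal
sketch in Perl. It is not what DictionariesCommon.pm does. The sample value
\351 is made up for illustration, and single quotes stand in for text read
from a file. The escape is only interpreted when it appears in double-quoted
source code; text that arrives from a file has to be decoded by hand.

```perl
use strict;
use warnings;

# In double-quoted source code, Perl itself turns "\351" into one character.
my $in_source = "\351";

# Text read from a file arrives as the four literal characters \, 3, 5, 1.
# Single quotes are used here to imitate that (hypothetical sample value).
my $from_file = '\351';

# Decode octal escapes by hand: replace each \ooo with the character it names.
(my $decoded = $from_file) =~ s/\\([0-7]{1,3})/chr(oct($1))/ge;

printf "in source: %d, from file: %d, decoded: %d\n",
    length($in_source), length($from_file), length($decoded);
print "decoded string equals the source literal\n" if $decoded eq $in_source;
```

Whether a substitution like this is appropriate depends on which encoding
the consumer of the string expects, so treat it as an illustration only.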

\x{263a} A wide hexadecimal value

\x{263a} in Perl seems to be the same as \x{263a} in S-Lang; both give me
"☺", a smiley face (U+263A). But I don't think this works the same in UTF-8
and ASCII mode; for that you need hexadecimal numbers (below 256, of course).
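
A small Perl sketch of the difference, as far as I understand it. It uses
the core Encode module, the value \xe9 is just a sample I picked, and it
says nothing about how S-Lang behaves. A wide escape names a code point that
takes several bytes in UTF-8, while a value below 256 still fits in a single
byte in a Latin-1 style encoding.

```perl
use strict;
use warnings;
use Encode qw(encode);

# A wide escape: one character, code point U+263A.
my $wide       = "\x{263a}";
my $wide_utf8  = encode('UTF-8', $wide);        # bytes E2 98 BA

# A value below 256 (hypothetical sample): one character, code point U+00E9.
my $small        = "\xe9";
my $small_latin1 = encode('iso-8859-1', $small); # one byte
my $small_utf8   = encode('UTF-8', $small);      # two bytes

printf "U+%04X: %d character, %d bytes in UTF-8\n",
    ord($wide), length($wide), length($wide_utf8);
printf "U+%04X: %d byte in Latin-1, %d bytes in UTF-8\n",
    ord($small), length($small_latin1), length($small_utf8);
```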

> At any rate, the strings in jed-ispell-dicts.sl are too long for aspell-bg
> and ispell_init.sl fails here with the error message:
>
> /var/cache/dictionaries-common/jed-ispell-dicts.sl:232: String too long for buffer: found '??'
>
> Is this normal?

That sounds like a problem. I guess the string is longer than 256 characters.

