[Po4a-devel]Encoding options
Denis Barbier
barbier@linuxfr.org
Thu, 5 Aug 2004 01:16:55 +0200
On Wed, Aug 04, 2004 at 10:45:36PM +0200, Jordi Vilalta wrote:
[...]
> >No, it must be ASCII by default because 'ascii is preferred over utf-8
> >by translators'.
> >
>
> Well, this "detect" means that the document specifies the charset
> inside itself (like the xml headers: <?xml encoding='iso-8859-1'?>),
> the format module checks it, and then this should be converted to utf-8.
Which charset will be used for the POT file? Just to be clear, my
opinion is that if an encoding is declared (either iso-8859-1 or utf-8)
but the document contains only ASCII characters, the POT file should not
declare its charset as UTF-8 (i.e. charset=CHARSET is left unchanged).
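
To make the rule concrete, here is a minimal sketch in Python (not po4a's
actual Perl code). The function name pot_charset and its parameters are
hypothetical, and it assumes the format module has already extracted the
declared encoding from the document:

```python
def pot_charset(master: bytes, declared: str) -> str:
    """Pick the charset value for the POT header (hypothetical helper).

    master   -- raw bytes of the master document
    declared -- encoding the document declares for itself (assumed to be
                a codec name Python knows, e.g. "iso-8859-1" or "utf-8")
    """
    # Decode with the declared encoding; this raises UnicodeDecodeError
    # if the declaration does not match the actual bytes.
    text = master.decode(declared)
    # Pure ASCII content: leave the placeholder alone so the translator
    # remains free to pick a charset.
    if text.isascii():
        return "CHARSET"
    # Otherwise the strings are converted to UTF-8, so declare UTF-8.
    return "UTF-8"
```

With this sketch, an ASCII-only document declared as iso-8859-1 still
yields CHARSET, while the same declaration with an accented character
yields UTF-8.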
> >>- If nothing can determine the file encoding, assume it's in ascii and
> >> don't convert anything (and set the po charset to something invalid, so
> >> that the translator can set it)
> >
> >If master file contains non-ASCII characters, one can check whether it
> >is UTF-8 encoded. In such a case, lib/Locale/Po4a/Po.pm has to write
> > "Content-Type: text/plain; charset=UTF-8\n"
> >instead of
> > "Content-Type: text/plain; charset=CHARSET\n"
> >in the POT file. If translated PO files already exist, they have to
> >be converted to UTF-8 so that they can be merged with the POT file.
>
> Do you mean that an update on the master document can cause the change
> from ascii to utf-8 and we should convert the po files to utf-8 when
> updating?
Yes, as soon as the master document contains non-ASCII characters, PO files
have to be UTF-8 encoded.
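
The two steps discussed above (checking an undeclared master file for
UTF-8, then converting an existing PO file) could look roughly like the
following Python sketch. It is an illustration only, not what
lib/Locale/Po4a/Po.pm does; both function names are hypothetical, and
rejecting non-ASCII input that is not valid UTF-8 is my assumption, since
that case is not settled in this thread:

```python
import re


def header_charset(master: bytes) -> str:
    """Charset for the POT header when no encoding is declared."""
    if master.isascii():
        return "CHARSET"          # nothing to convert, keep the placeholder
    # Assumption: non-ASCII input must be valid UTF-8; otherwise
    # bytes.decode raises UnicodeDecodeError and we let it propagate.
    master.decode("utf-8")
    return "UTF-8"


def convert_po_to_utf8(po: bytes, old_charset: str) -> bytes:
    """Re-encode an existing PO file as UTF-8 and fix its header."""
    text = po.decode(old_charset)
    # Rewrite only the first charset declaration (the header entry).
    # The value ends at the literal backslash of the "\n" escape.
    text = re.sub(
        r'(Content-Type: text/plain; charset=)[^\\"\s]+',
        r"\g<1>UTF-8",
        text,
        count=1,
    )
    return text.encode("utf-8")
```

After such a conversion the PO file and the UTF-8 POT file declare the
same charset, so they can be merged.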
Denis
PS: I will be away for 2 weeks from Sunday, and do not know whether I will
be able to read mail before leaving.