[Po4a-devel]Encoding options

Denis Barbier barbier@linuxfr.org
Thu, 5 Aug 2004 01:16:55 +0200


On Wed, Aug 04, 2004 at 10:45:36PM +0200, Jordi Vilalta wrote:
[...]
> >No, it must be ASCII by default because 'ascii is preferred over utf-8
> >by translators'.
> >
> 
> Well, this "detect" means that the document specifies the charset
> within itself (like the XML headers: <?xml encoding='iso-8859-1'?>),
> the format module checks it, and then the content should be converted to utf-8.

Which charset will be used for the POT file?  Just to be clear, my
opinion is that if an encoding is declared (either iso-8859-1 or utf-8)
but the document contains only ASCII characters, the POT file should not
declare its charset as UTF-8 (i.e. charset=CHARSET is left unchanged).
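
To make the rule above concrete, here is a minimal sketch in Python (not
po4a's actual Perl code).  The function names pot_charset and
pot_content_type are hypothetical, and the sketch assumes that any master
document containing non-ASCII characters has already been, or will be,
converted to UTF-8 before extraction.

```python
def pot_charset(master: bytes) -> str:
    """Pick the charset value for the POT header (hypothetical helper).

    A declared encoding is ignored on purpose: only the actual bytes
    of the master document matter.
    """
    try:
        master.decode("ascii")
    except UnicodeDecodeError:
        # Non-ASCII content: assumed to be handled as UTF-8 from here on.
        return "UTF-8"
    # Pure ASCII: leave the placeholder for the translator to fill in.
    return "CHARSET"


def pot_content_type(master: bytes) -> str:
    """Build the Content-Type line as it would appear in the POT header."""
    # Raw f-string: the trailing \n stays a literal backslash followed by n.
    return rf'"Content-Type: text/plain; charset={pot_charset(master)}\n"'
```

With this, a document that declares iso-8859-1 but contains only ASCII
characters still yields charset=CHARSET.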

> >>- If nothing can determine the file encoding, assume it's in ascii and
> >>  don't convert anything (and set the po charset to something invalid, so
> >>  that the translator can set it)
> >
> >If the master file contains non-ASCII characters, one can check whether it
> >is UTF-8 encoded.  In such a case, lib/Locale/Po4a/Po.pm has to write
> >  "Content-Type: text/plain; charset=UTF-8\n"
> >instead of
> >  "Content-Type: text/plain; charset=CHARSET\n"
> >in the POT file.  If translated PO files already exist, they have to
> >be converted to UTF-8 so that they can be merged with the POT file.
> 
> Do you mean that an update on the master document can cause the change 
> from ascii to utf-8 and we should convert the po files to utf-8 when 
> updating?

Yes, as soon as the master document contains non-ASCII characters, PO files
have to be UTF-8 encoded.
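
As an illustration only, converting an existing PO file could look like
the Python sketch below.  The function po_to_utf8 and the regular
expression are hypothetical and not part of po4a; the sketch assumes the
PO file declares an ASCII-compatible charset (such as ISO-8859-1) in its
Content-Type header.

```python
import re

# Matches the charset value in the PO header's Content-Type line.
CHARSET_RE = re.compile(rb"(Content-Type:[^\n]*?charset=)([A-Za-z0-9_.:-]+)")


def po_to_utf8(po: bytes) -> bytes:
    """Re-encode a PO file as UTF-8 and update its declared charset."""
    match = CHARSET_RE.search(po)
    if match is None:
        raise ValueError("no charset declared in the PO header")
    declared = match.group(2).decode("ascii")
    if declared == "CHARSET":
        raise ValueError("charset is still the CHARSET placeholder")
    # Rewrite the declaration first (it is plain ASCII), then transcode
    # the whole file from the old charset to UTF-8.
    updated = po[:match.start(2)] + b"UTF-8" + po[match.end(2):]
    return updated.decode(declared).encode("utf-8")
```

In practice the msgconv tool from GNU gettext (with --to-code=UTF-8) does
this job; the sketch only shows the idea.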

Denis
PS: I am away on Sunday for 2 weeks, and do not know if I will be able to
read mail before leaving.