[Po4a-devel]Encoding options

Martin Quinson mquinson@ens-lyon.fr
Tue, 3 Aug 2004 15:40:42 -0700


Thanks for your work on encoding issues, dudes, I really suck at that.

On Wed, Aug 04, 2004 at 12:29:54AM +0200, Denis Barbier wrote:
> On Tue, Aug 03, 2004 at 11:46:50PM +0200, Jordi Vilalta wrote:

> You were talking about po4a-translate and localized file charset, and
> now gettextizing master file.  In the latter case, if master file
> contains only ASCII, no conversion is performed.  Otherwise it has to be
> recoded into UTF-8, and there is indeed a problem if original charset is
> not specified.  One could check whether it is UTF-8, and fall back to
> ISO-8859-1 otherwise, but unspecified encodings really suck, so let's
> be pedantic and force those people to declare their encoding.  After
> all they know the encoding used in their English documentation, so they
> can add the right options to po4a tools.

I'm OK with being pedantic here, too. This approach would suit me:
For the master:
 - if no encoding is specified, assume it is UTF-8
 - if it's not valid UTF-8, refuse to process it until told what the encoding is
For translations:
 - if not specified, assume it's the same as the one used in the translated
   part of the po file
 - it would be cool if we could check that the encoding is not broken, but I'm
   not sure whether that's even possible.
 - during gettextization, assume it's UTF-8 if no encoding is provided, and
   whine for a proper setting if that's not the case
For po files:
 - msgid must be in UTF-8, no matter what.
 - msgstr has to be in the encoding specified in the po file headers.
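
To make the rules above concrete, here is a rough sketch in Python (po4a
itself is Perl, so this is only an illustration, not po4a code). All the
function and class names are made up for the example. The only outside fact
it relies on is that a po header carries a line such as
"Content-Type: text/plain; charset=ISO-8859-1", with "CHARSET" as the
unfilled template placeholder.

```python
import re
from typing import Optional


class UndeclaredEncodingError(ValueError):
    """Hypothetical error: the user has to declare the encoding."""


def decode_master(raw: bytes, declared: Optional[str] = None) -> str:
    """Decode a master document (also usable for gettextization input).

    If an encoding is declared, trust it. Otherwise assume UTF-8 and
    refuse to go on when the bytes are not valid UTF-8. Pure ASCII
    passes, since ASCII is a subset of UTF-8.
    """
    if declared is not None:
        return raw.decode(declared)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise UndeclaredEncodingError(
            "not valid UTF-8; please declare the encoding"
        ) from exc


def po_header_charset(po_header: str) -> Optional[str]:
    """Return the charset named in a po header, or None if it is unset."""
    match = re.search(r"charset=([A-Za-z0-9_.:-]+)", po_header)
    if match is None or match.group(1).upper() == "CHARSET":
        return None  # missing, or still the template placeholder
    return match.group(1)


def translation_charset(declared: Optional[str], po_header: str) -> str:
    """Pick the encoding of a translated document.

    An explicit setting wins; otherwise fall back to the charset that
    the po file declares for its msgstr entries.
    """
    if declared is not None:
        return declared
    charset = po_header_charset(po_header)
    if charset is None:
        raise UndeclaredEncodingError("no charset in po header")
    return charset
```

With this, decode_master(b"\xe9") raises instead of guessing ISO-8859-1,
while decode_master(b"\xe9", "iso-8859-1") goes through. Checking that a
translation is "not broken" is not covered here; for single-byte charsets
nearly any byte sequence decodes without error, so a decode attempt alone
would prove little.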
 
And once all this is implemented, we should be able to stop assuming
that master-document = english-document ;)
 
Again, I have no definitive idea of how all this should work; this is merely
a proposal.

Thanks, Mt.

-- 
In deepest France, there are mostly speleologists.
   -- Le Chat