[Po4a-devel]Non breaking spaces in man pages

Jordi Vilalta jvprat@wanadoo.es
Wed, 16 Feb 2005 21:25:31 +0100 (CET)


Hi,

On Wed, 16 Feb 2005, Nicolas François wrote:
> On Wed, Feb 16, 2005 at 01:00:09AM +0100, Jordi Vilalta wrote:
>> I was just gettextizing some man pages and I've noticed a problem when
>> trying to mix several po files:
>>
>> $ msgcat *.po
>> file1.po:19:10: invalid multibyte sequence
>> msgcat: found 1 fatal error
>>
>> I've found that there was a strange character in that position, and it
>> seems it's the equivalent of man page's "\ ". What's its meaning? Why is
>> it handled with this strange byte? It seems we're generating non-compliant
>> po files :S
>
> Yes, "\ " are changed to 0xA0. Maybe this should be done only if the
> charset used support this character (at least UTF-8 & latin-1).

Is it important to maintain a "\ " instead of converting it to a standard
space? When translators rewrite the message, (I think) they write standard
spaces, so the "\ " loses its possible utility. If it's important to
maintain them, I think it would be better to put "\ " in the po files.
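
To make the two alternatives concrete, here is a minimal Python sketch. It is
not po4a code (po4a is written in Perl); the function names are made up for
illustration, and it assumes the non-breaking space has already been decoded
to the character U+00A0.

```python
# Hypothetical helpers illustrating the two options discussed above.
# Assumption: the non-breaking space is the Unicode character U+00A0.
NBSP = "\u00a0"

def nbsp_to_roff_escape(text: str) -> str:
    """Keep the non-breaking space, written as the roff escape "\\ "."""
    return text.replace(NBSP, "\\ ")

def nbsp_to_space(text: str) -> str:
    """Drop the non-breaking property and use a standard space."""
    return text.replace(NBSP, " ")

sample = "-V" + NBSP + "--version"
print(nbsp_to_roff_escape(sample))
print(nbsp_to_space(sample))
```

Both results are pure ASCII, so they are valid under any of the charsets
mentioned in this thread. In a PO file the backslash itself would have to be
written as `\\`, since PO strings use C-style escapes.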

> However, I'm surprised it generate an error. I'm only getting warnings
> (sometimes annoying):
> warning: The following msgid contains non-ASCII characters.
>         This will cause problems to translators who use a character encoding
>         different from yours. Consider using a pure ASCII msgid instead.
>
> (There is no warning when the charset is UTF-8)
>
> Can you point me to the man page you gettextized (I will need the original
> and translated man page)?

It has happened, for example, with the ldd man page (along with many others).
There's no need to use the translated one. Here's a simple example to
reproduce it:

- create a simple man page that contains this line (typical):
     \-V\ \-\-version

- po4a-gettextize -f man -m file.man -p file.po

- edit file.po to put a valid charset

- msgcat file.po: with ascii and utf-8 charsets I get this:
     file.po:19:10: invalid multibyte sequence
     msgcat: found 1 fatal error

If I use iso-8859-1, for example, I get the warning you mentioned. But msgids
should be valid in ascii or utf-8 (culturally neutral).
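
The difference between the three charsets can be checked outside gettext.
The following is a minimal Python sketch, not msgcat or po4a code. It assumes
the byte written for "\ " is 0xA0, as stated in the quoted reply; the helper
name `decodes_as` is hypothetical.

```python
# Assumption: po4a writes the single byte 0xA0 in place of "\ ".
NBSP_BYTE = b"\xa0"

def decodes_as(data: bytes, charset: str) -> bool:
    """Return True if data is a valid byte sequence in the given charset."""
    try:
        data.decode(charset)
        return True
    except UnicodeDecodeError:
        return False

# The example man page line, with 0xA0 standing in for "\ ".
msgid = b"-V" + NBSP_BYTE + b"--version"

results = {cs: decodes_as(msgid, cs) for cs in ("ascii", "utf-8", "iso-8859-1")}
print(results)

# In UTF-8 the same character needs two bytes (0xC2 0xA0).
print("\u00a0".encode("utf-8"))
```

A lone 0xA0 is outside the ASCII range and is only a continuation byte in
UTF-8, so it is rejected under both, while in ISO-8859-1 it is the
non-breaking space itself. That matches the fatal error with ascii and utf-8
and the mere warning with iso-8859-1.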

Regards,

Jordi Vilalta