[Po4a-devel] Bug#480997: pod2man turns UTF8 into "X"s

Russ Allbery rra at debian.org
Tue May 20 01:12:57 UTC 2008


Robert Luberda <robert at debian.org> writes:
> On Mon, 12 May 2008, Russ Allbery wrote:

>> pod2man doesn't support non-ISO characters because doing so completely
>> breaks (and by breaks, I mean up to and including segfaults) nroff on
>> some platforms and the intention is that pod2man output be portable
>> nroff.
>
> Previous version (5.8.8) does support it.

As far as I can tell, no version of pod2man has ever supported non-ASCII
characters, except for the ISO 8859 subset for which it has troff
substitutions (and even those it mangles in nroff).  Certainly no version
I've maintained has, and if it worked with Tom's version, that was by
accident.  The failure mode may well have changed with Pod::Simple,
however.

It sounds like po4a was relying on unsupported behavior, which is
unfortunate; I've tried to be clear from the start that pod2man doesn't
support and won't support non-ASCII characters until it can output Unicode
directly.  Pod::Simple, which is the new parsing layer in 5.10, is much
more formal about how it handles character set parsing.

Basically, the end result is that this is not a bug that I can fix without
doing work that I'm not sure I have time to do.  I would certainly welcome
patches to teach it to (optionally) output UTF-8 directly and just assume
that the output device can cope; it should be a command-line option to
start with.  That may not be too hard, and if I find time, I'll see what I
can do for the next release, but no promises there.
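
For concreteness, here is a rough sketch of the opt-in behavior described
above.  It is in Python rather than Perl purely for illustration, and the
names (render_text, substitutions, utf8) are hypothetical, not pod2man's
actual code, options, or tables.  It assumes the default path keeps ASCII,
uses a substitution where one exists, and falls back to "X" (the symptom in
the bug title), while the opt-in path passes the text through untouched.

```python
# Illustrative sketch only: every name here is hypothetical and is not
# taken from pod2man or Pod::Man.

FALLBACK = "X"  # assumed fallback, matching the "X"s in the bug title


def render_text(text, substitutions, utf8=False):
    """Return text prepared for a formatter's output.

    substitutions maps single non-ASCII characters to replacement
    strings (for example, troff escapes supplied by the caller).
    With utf8=True the text is passed through unchanged, on the
    assumption that the output device can cope.
    """
    if utf8:
        return text

    out = []
    for ch in text:
        if ord(ch) < 128:
            # Plain ASCII is always safe to emit as-is.
            out.append(ch)
        else:
            # Use a known substitution if there is one, else the fallback.
            out.append(substitutions.get(ch, FALLBACK))
    return "".join(out)


if __name__ == "__main__":
    sample = "caf\u00e9"
    print(render_text(sample, {}))             # portable default
    print(render_text(sample, {}, utf8=True))  # opt-in pass-through
```

Keeping the pass-through behind an explicit flag leaves the default output
portable and makes the "assume the device can cope" decision the caller's.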

-- 
Russ Allbery (rra at debian.org)               <http://www.eyrie.org/~eagle/>


