[Po4a-devel]Some extensive testing of the man module

Martin Quinson mquinson@ens-lyon.fr
Sun, 15 Aug 2004 21:22:11 -0700



Hello,

Out of curiosity, I launched again my script which tries to normalize all the
man pages on my hard disk. That is to say, it takes each page, puts it through
po4a and spits it out without translation of any kind. Then it asks groff to
reformat both the original and the result of po4a, and compares the results.

Here are some results:

There are 6531 pages on my machine (not counting translated ones).

2078 pages are perfectly handled.
  96 are only rewrapped (diff sees a difference, wdiff doesn't)
 791 produce a difference at wdiff level.
     Most of those changes are ignorable (different hyphenation, change
       of ``bla'' to "bla", i.e. in fact \*(lqbla\*(rq, which is the proper
       groff way of life)
     But some of them are real errors. Unfortunately, I do not have the time
       to sort out which is which.
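
The three buckets above (identical, rewrapped only, wdiff-level difference)
can be sketched in Python as follows. This is only an illustration, not the
actual script: the function name `classify` is hypothetical, and comparing
whitespace-separated words is an assumed stand-in for what wdiff reports.

```python
def classify(original: str, normalized: str) -> str:
    """Classify how two groff renderings of the same page differ."""
    if original == normalized:
        # A line-based diff would report nothing.
        return "identical"
    # Comparing whitespace-separated words ignores line breaks and spacing,
    # which approximates a word-level comparison such as wdiff.
    if original.split() == normalized.split():
        return "rewrapped"
    return "word-level difference"
```

Here `original` would be the groff output for the untouched page and
`normalized` the groff output for the page after its round trip through po4a.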

1884 pages are ignored.
 1689 are generated from Pod::Man (use pod module instead)
   11 are generated from docbook-to-man (use sgml module instead)
  184 are generated from docbook2man (use sgml module instead)

1682 use a construct that the parser refuses to handle. Note that the parser
     stops on the first one, so the counts are biased.
  615 use nested font modifiers. We could deal with that, but I'm not in a
      hurry. Most of the time, authors make mistakes when playing tricks with
      font modifiers. I'm glad po4a helps to fix them.
  252 include files. We want to fix that.
  242 define new macros with .de (67 of them because they use db2man. Once
      we have a released xml module, those can be handled with the proper
      module). Not much we can do for the others, I guess.
  205 use macros we do not know yet. The details are attached. A bunch of
      them seem to be typos. The others should be added.
  144 use conditionals (4 .ie/140 .if). I can't remember whether we have a
      chance to deal with them.
   64 are mdoc(7) formatted. We may want to add support for this.
   59 use non-ascii chars, but we failed to detect the encoding.
   55 put the args of .B or .I on the next line (36 .B/19 .I). It should be
      possible to deal with that. example: xbubble(6).
   46 use the \c escape char. I forgot what it does, but I remember there is
      not much we can do.

I have the full log of all details, if you're interested.

So, as a result, po4a can translate 2/3 of the man pages on my hard disk
without a glitch, which is not bad. Some of the remaining pages need to be
patched; few of them seem to require po4a itself to be patched.
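
As a quick check of the arithmetic, the counts above can be tallied with a
short Python sketch. The grouping into three dictionaries and the reading of
"2/3" as a share of the pages that are not ignored are assumptions; the log
itself does not say which denominator is meant.

```python
# Counts copied from the figures above; the grouping is an assumption.
handled = {"perfect": 2078, "rewrapped only": 96, "wdiff-level change": 791}
ignored = {"Pod::Man": 1689, "docbook-to-man": 11, "docbook2man": 184}
refused = {
    "nested font modifiers": 615,
    "included files": 252,
    "macros defined with .de": 242,
    "unknown macros": 205,
    "conditionals": 144,
    "mdoc(7) pages": 64,
    "undetected encoding": 59,
    ".B/.I args on next line": 55,
    "\\c escape": 46,
}
total_pages = 6531

n_handled = sum(handled.values())
n_ignored = sum(ignored.values())
n_refused = sum(refused.values())

# The three groups account for every page on the disk.
assert n_handled + n_ignored + n_refused == total_pages

# Share of the pages that po4a actually attempted (ignored ones excluded).
n_attempted = total_pages - n_ignored
share_of_attempted = n_handled / n_attempted
print(f"{n_handled} of {n_attempted} attempted pages: {share_of_attempted:.0%}")
```

Under that reading, 2965 of the 4647 attempted pages go through, which is a
little under two thirds.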

For the record, I have 2 pages I wrote myself on my hard disk (xbubble(6) and
quilt(1)), and neither of them is po4a-friendly. The first puts .B arguments
on the next line, and the second contains a typo (use of ." instead of .\").
I need to fix it ;)

Here are some bugs I've identified by glancing at the logs:
**BUG 1**
.BI -a\  addresses  => .BI "-a addresses"
(example: nws_sensor.1)

**BUG 2**
  blabliblu \
  bloblobla
changed to
  blabliblu  bloblobla
(example: gettimeofday.2)
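
A minimal sketch of the joining one would expect here, in Python. It assumes
the roff rule that a backslash immediately followed by a newline continues
the input line, and that the indentation in the example above is only there
for display in this mail. The function name `join_continuations` is
hypothetical and is not part of po4a.

```python
def join_continuations(lines):
    """Join each line ending in a backslash with the line that follows it."""
    joined = []
    pending = ""
    for line in lines:
        if line.endswith("\\"):
            # Drop the trailing backslash but keep what precedes it,
            # including any space, and wait for the next line.
            # Limitation: a line ending in an escaped backslash (\\)
            # is not told apart from a real continuation.
            pending += line[:-1]
        else:
            joined.append(pending + line)
            pending = ""
    if pending:
        # The input ended on a continued line; keep what was collected.
        joined.append(pending)
    return joined
```

With the two lines of the example, this gives a single line with one space
between the words rather than two.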

**BUG 3**
The wrapper removes double spaces after periods. This is where all the
wrapping changes (96 pages at least) seem to come from.

**BUG 4**
.IB arg .array \fR.
The argument
.I semnum
is ignored.

changed to I<arg>B<.array>. I<The argument semnum> is ignored.
(example: semctl.2)

Of course, I plan to fix all of them one day ;)

Thanks, Mt.

-- 
Each language has its purpose, however humble.  Each language expresses the
Yin and Yang of software.  Each language has its place within the Tao.
But do not program in COBOL if you can avoid it.
          -- The Tao of programming
