[Po4a-devel][RFC] Multi-lines verbatim blocks

Martin Quinson martin.quinson@imag.fr
Fri, 19 Nov 2004 16:22:44 +0100


On Thu, Nov 18, 2004 at 11:40:38PM +0100, Nicolas François wrote:
> On Tue, Nov 16, 2004 at 01:43:35PM +0100, martin.quinson@imag.fr wrote:
> > On Mon, Nov 15, 2004 at 11:55:31PM +0100, Nicolas François wrote:
> > > Hello,
> > >
> > > To solve an issue with the man module, I've implemented a way to specify
> > > some (multi-line) verbatim blocks.
> >
> > I'm not sure I understand what you're talking about. Could you please give
> > an example of use? Why can't you simply pushline all the lines until you
> > encounter the end block boundary?
>
> Here is what most tetex-bin manpages contain:
> (it is not always exactly the same header, I'm using amstex as an example)
> .\"==========================================================================
> .if t .ds TX \fRT\\h'-0.1667m'\\v'0.20v'E\\v'-0.20v'\\h'-0.125m'X\fP
> .if n .ds TX TeX
> .ie t .ds OX \fIT\v'+0.25m'E\v'-0.25m'X\fP\" for troff
> .el .ds OX TeX\" for nroff
> .\" the same but obliqued
> .\" BX definition must follow TX so BX can use TX
> .if t .ds BX \fRB\s-2IB\s0\fP\*(TX
> .if n .ds BX BibTeX
> .\" LX definition must follow TX so LX can use TX
> .if t .ds LX \fRL\\h'-0.36m'\\v'-0.15v'\s-2A\s0\\h'-0.15m'\\v'0.15v'\fP\*(TX
> .if n .ds LX LaTeX
> .if t .ds AX \fRA\\h'-0.1667m'\\v'0.20v'M\\v'-0.20v'\\h'-0.125m'S\fP\*(TX
> .if n .ds AX AmSTeX
> .if t .ds AY \fRA\\h'-0.1667m'\\v'0.20v'M\\v'-0.20v'\\h'-0.125m'S\fP\*(LX
> .if n .ds AY AmSLaTeX
> .\"==========================================================================
>
> This makes it possible to define nice formatting for TeX, LaTeX, etc. when
> the document is processed in troff mode (it only works with some devices,
> like ps), and to fall back to a simple representation in nroff mode.
> These strings are later used as "\*(TX", "\*(NX", etc.
>
> I wanted to indicate to po4a that when this sequence of lines is
> encountered, it only has to be copied into the translation as is, without
> trying to parse it.
> (Note that in the end I decided to split this block into smaller blocks for
> better reuse, because the header varies from one page to another.)

Ok. Thanks for the example. This is close to an idea I have had in mind for a
while. I never discussed it here for lack of time (and because the idea
lacked maturity).

There is a bug against po4a in Debian asking for a text module. It would
split classical itemized lists marked with '*' and indentation. Assume for a
moment that it is already written.
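To make the idea concrete, here is a minimal sketch of the splitting rule
such a text module might use. This is illustrative Python only (po4a modules
are Perl, and `split_blocks` is an invented name, not any existing po4a
interface):

```python
def split_blocks(text):
    # Split on blank lines into paragraphs, and additionally start a
    # new block at each '*' bullet, so list items translate separately.
    blocks, current = [], []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith('* '):
            if current:
                blocks.append('\n'.join(current))
                current = []
        if line.strip():
            current.append(line)
    if current:
        blocks.append('\n'.join(current))
    return blocks
```

Each returned block would become one msgid, which is the granularity a
translator wants for itemized text.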

I would like to write a debian changelog module by reusing this text module.
A clean way to do so is to make a "meta parser" that separates the entries of
the changelog and passes each of them to the text module above.
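In Python-flavoured pseudocode, the meta parser could look like the sketch
below. The entry-start regex and the `parse_with_text_module` stub are
assumptions for illustration; a real implementation would delegate to the
text module's parser:

```python
import re

# A debian changelog entry starts with "package (version) dist; urgency=...".
ENTRY_START = re.compile(r'^\S+ \([^)]+\) [^;]+; urgency=', re.M)

def split_entries(changelog):
    """Return one string per changelog entry."""
    starts = [m.start() for m in ENTRY_START.finditer(changelog)]
    return [changelog[a:b].rstrip('\n') + '\n'
            for a, b in zip(starts, starts[1:] + [len(changelog)])]

def parse_with_text_module(entry):
    # Stub standing in for the hypothetical text module: a real meta
    # parser would feed the entry to that module and collect its output.
    return entry

def meta_parse(changelog):
    # The "meta parser": split into entries, delegate each one.
    return ''.join(parse_with_text_module(e) for e in split_entries(changelog))
```

The point is only the shape: the meta parser owns the entry boundaries, and
each entry is handed to a sub-parser that never sees the rest of the file.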

As you can see, this is just like your example of parts not being handled
the regular way. But I'm even crazier: I want to use this kind of feature to
design meta parsers. ;)

This is also exactly the kind of feature we need for a wml module. It should
switch between html, mp4h and perl contexts.

If we add an "echo" module that does not change anything in the input text (2
lines?), my idea would encompass yours.
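To show how small such an "echo" module really is, here is the whole thing in
conceptual Python (a real po4a module would be a Perl shiftline/pushline
loop, but the logic is the same):

```python
def echo_parse(lines):
    # The entire "echo" module: emit every input line unchanged.
    return list(lines)
```

Plugging this into the meta-parser scheme gives exactly the verbatim-block
behaviour: the boundary-delimited region is copied through untouched.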

I'd prefer us to think about the more general concept here. I'm not sure how
to do so, nor even whether it's doable (i.e., whether the added complexity is
worth the gain)...


> Here is another example with the cron-apt.sgml file (old #278365)
[...]
> Your fix is much better in this case. I just want to show that it could
> give users a solution for some points we can't (or are not inclined to)
> deal with.

Yeah, your solution would have been really hackish. He could have edited the
original document, too. :)

> I have to admit that I didn't really think about using begin and end
> boundaries.
>
> Boundaries are simpler to implement. They may be a little bit more risky
> (if the end boundary is not found, the whole document will be untranslated),
> but it will be up to the user to specify the right ones and verify the
> result.

Exactly. And they are more expressive. For example, I'd use <perl> as a
boundary in the wml meta-parser.
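A rough sketch of what that boundary-driven context switching could look
like, in illustrative Python (the `<perl>` tag handling and the chunk
interface are invented here, not an existing po4a mechanism):

```python
def split_contexts(lines):
    # Scan a wml document and tag each run of lines with its context:
    # 'perl' between <perl>...</perl> boundaries, 'html' elsewhere.
    chunks, current, ctx = [], [], 'html'
    for line in lines:
        if line.strip() == '<perl>':
            chunks.append((ctx, current))      # close the html chunk
            current, ctx = [line], 'perl'
        elif line.strip() == '</perl>':
            current.append(line)
            chunks.append((ctx, current))      # close the perl chunk
            current, ctx = [], 'html'
        else:
            current.append(line)
    chunks.append((ctx, current))
    return [(c, ls) for c, ls in chunks if ls]  # drop empty chunks
```

The meta parser would then route each chunk to the html, perl, or mp4h
sub-parser according to its tag. Note the risk mentioned above: if
`</perl>` never appears, the rest of the document stays in the perl context.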

> (it could later be used as an addendum by adding a header). Then these
> strings would not be given to the parser, they wouldn't appear in the
> French po during gettextization, and the English and French POs should then
> match (I still have not tried it).
> If needed, the addendum format could be used/supported.

Addenda were already difficult to use. I fear that after this
generalization, nobody will ever be able to use them. Including us. :)


That being said, I feel that we are close to a clean design, but I cannot pin
it down yet. I have too many things to think about...


Thanks for your time,
Mt.
