[Po4a-devel]Design of the (La)TeX module

Denis Barbier barbier@linuxfr.org
Fri, 10 Dec 2004 23:34:32 +0100


On Mon, Dec 06, 2004 at 09:47:09PM +0100, Nicolas François wrote:
> Hello,
>
> Based on my reading on LaTeX and Nicolas' book, I will change a little
> bit the implementation.
>
>
> It is much more formalized than the previous prototype. If it works, the
> content of this mail may be used as documentation for this module.
>=20
> 1) The functions:
> parse:
>   * The parse function will only separate paragraphs. The separator is an
>     empty line or a line beginning with a comment.
>     What I'm calling here a "paragraph" is not a paragraph in the output
>     document, but a block of code separated by one of these separators.

No, comments should simply be ignored when splitting into paragraphs.
It is not uncommon to write a comment within a paragraph.
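
A minimal sketch of that splitting rule, written in Python for
illustration only (po4a itself is Perl, and split_paragraphs is a
hypothetical name, not a po4a function). Only empty lines separate
blocks; comment lines stay inside the block they appear in:

```python
def split_paragraphs(text):
    """Split LaTeX source into blocks separated by empty lines only."""
    paragraphs, current = [], []
    for line in text.splitlines():
        if line.strip() == "":
            # An empty line closes the current block, if there is one.
            if current:
                paragraphs.append("\n".join(current))
                current = []
        else:
            # Comment lines are kept in the block; they are not separators.
            current.append(line)
    if current:
        paragraphs.append("\n".join(current))
    return paragraphs
```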

>   * The parse function will also remove the comments from this paragraph
>     and keep them in a buffer (to be pushed as PO comments if there is
>     a string to translate in this paragraph, or ignored otherwise).
>     The comments will be ignored in the localized document.
>     (This doesn't concern lines beginning with a comment, which will just
>     be pushed, like empty lines)

Looks fine.
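
As a rough illustration of the comment buffer (a Python sketch, not
po4a's Perl code; strip_comments is a hypothetical name), the comments
of one paragraph could be collected like this. It treats a backslash as
escaping the next character, so \% is not a comment, and it simply drops
comment-only lines. It does not model TeX's rule that a trailing %
also swallows the newline:

```python
def strip_comments(paragraph):
    """Return (text_without_comments, comments) for one paragraph."""
    kept, comments = [], []
    for line in paragraph.splitlines():
        i = 0
        while i < len(line):
            if line[i] == "\\":
                i += 2  # skip the escaped character, e.g. \%
                continue
            if line[i] == "%":
                break
            i += 1
        if i < len(line):
            # Everything after the % goes to the comment buffer.
            comments.append(line[i + 1:].strip())
            line = line[:i].rstrip()
            if not line:
                continue  # comment-only line: do not leave a blank line behind
        kept.append(line)
    return "\n".join(kept), comments
```

The returned comments list is what would be attached as PO comments
when the paragraph contains a translatable string.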

>   * Once a paragraph is found, the translation of the paragraph (built by
>     translate_buffer) is pushed.
>
>
> translate_buffer: return the translation of a buffer (typically a
> paragraph or a subset of a paragraph)

See above, IMO it should be a paragraph.

>   1) call get_leading_command, to handle a leading command
>      If the paragraph begins with a command, call this command's subroutine
>      with the paragraph as argument and append this translation to the
>      translated buffer.
>      Loop until there is no more leading command.
>
>   2) call get_trailing_command, to handle trailing command (loop)
>      while there are trailing commands, call these commands, and build
>      a translated buffer to push at the end of the current paragraph.
>   3) append the translation of the remaining paragraph (if any)
>   4) append the translation of the trailing commands

Should work mostly fine with Nicolas' book, but what are these trailing
commands?

>   * it should be possible to keep the separator between the commands
>     (could be none, a space or a newline).
>
> One question: Is this separator important? For example, can I re-wrap:
> \inputprotcode
> \makeindex
> \begin{document}
> \myeqnspacing
>    into:
> \inputprotcode \makeindex \begin{document} \myeqnspacing
> or even
> \inputprotcode\makeindex\begin{document}\myeqnspacing

Normally spaces, tabs and newlines are equivalent, but there are some
circumstances where they are not, such as in source code listings.
It is likely that spaces do not matter in this book, so I would say
not to bother preserving them if ignoring them is much easier for you.

> parse_command:
>   A subroutine for the commands subroutine and get_leading_command /
>   get_trailing_command
>   * take a paragraph/buffer as argument
>   * output the command name, an optional * (for \chapter*{foo}), an array
>     of optional arguments (between []), an array of arguments (between {}),
>     and the remaining paragraph/buffer.
>
> Another question: Are optional arguments always before regular arguments?

Yes.
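
Under that assumption, parse_command could look roughly like the Python
sketch below (illustrative only; the real module would be Perl, and
read_group is a hypothetical helper). It returns the five values listed
above. Since it does not know how many arguments a command takes, it
greedily reads every [] group and then every {} group that directly
follows the name:

```python
import re

# A backslash followed by letters, or by a single non-letter.
COMMAND_RE = re.compile(r"\\([A-Za-z]+|[^A-Za-z])")

def read_group(text, pos):
    """Read the group opening at text[pos] ('{' or '[').

    Returns (content, position_after_group). Braces are tracked so that
    a ']' or '}' nested inside braces does not close the group early.
    """
    closer = "}" if text[pos] == "{" else "]"
    depth = 0  # brace nesting depth
    i = pos
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip escaped characters such as \{ or \]
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if closer == "}" and depth == 0:
                return text[pos + 1:i], i + 1
        elif ch == "]" and closer == "]" and depth == 0:
            return text[pos + 1:i], i + 1
        i += 1
    raise ValueError("unbalanced group")

def parse_command(buffer):
    """Return (name, star, optional_args, args, remaining), or None."""
    m = COMMAND_RE.match(buffer)
    if m is None:
        return None  # the buffer does not begin with a command
    name, pos = m.group(1), m.end()
    star = buffer.startswith("*", pos)
    if star:
        pos += 1
    optional, required = [], []
    while pos < len(buffer) and buffer[pos] == "[":
        content, pos = read_group(buffer, pos)
        optional.append(content)
    while pos < len(buffer) and buffer[pos] == "{":
        content, pos = read_group(buffer, pos)
        required.append(content)
    return name, star, optional, required, buffer[pos:]
```

For example, parse_command(r"\chapter*{foo} bar") yields the name
"chapter", a true star flag, no optional arguments, the argument list
["foo"], and " bar" as the remaining buffer.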

> get_leading_command:
>   Is probably the same as parse_command.
>
> get_trailing_command:
>   If the given paragraph ends with a command, then extract this command and
>   return the command name, etc. and the remaining paragraph.

Again I do not understand what makes trailing commands special, can you
please elaborate?

>   The parameter of a command can contain a command, so a simple regular
>   expression won't be sufficient.

Right.

>   To be understood as a trailing command, the command will have to end with
>   an argument (could be optional), or should not have any argument.
>
> I've read that a command is a \ followed by a string of lower and/or
> uppercase letters or a \ followed by a single nonletter.

Mostly true, you can ignore other cases for now.
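
That naming rule fits in a single pattern. A small Python sketch
(command_names is a hypothetical helper, shown only to illustrate the
rule) that lists the command names in a buffer; the "other cases", such
as @ counting as a letter after \makeatletter, are deliberately ignored:

```python
import re

# A backslash followed by a run of letters, or by exactly one non-letter.
COMMAND_NAME = re.compile(r"\\([A-Za-z]+|[^A-Za-z])")

def command_names(buffer):
    """Return the names of all commands found in buffer, in order."""
    return COMMAND_NAME.findall(buffer)
```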

[...]
> 3) Some questions:
>   * Are there commands that need to be translated?
>     For example, somebody may want to change \noindent into a
>     \localized_noindent.

No, localizing macros is a bad idea.

[...]
> % po4a: new_command x y z t
> where
>   * x is the number of optional arguments (between [])
>       0 - no optional argument
>      -1 - variable (can it be?)

Sometimes, yes.

>       n - maximum number of optional arguments (maybe -1 will be easier to use)
>   * y is the number of arguments
>     maybe x and y are not needed
>   * z array of indexes of the optional arguments that have to be translated
>      -1 - all optional arguments should be translated
>       0 - none
>   1,3,7 - the 1st, 3rd and 7th arguments should be translated
>   * t array of indexes of the arguments that have to be translated

I do not fully understand how your parser will work, but this point
seems important.  You will always find macros which have different
kinds of arguments, and I see no other solution than yours above.
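
To make the proposal concrete, here is a Python sketch of reading such a
line (illustrative only: CommandSpec, parse_indexes and
parse_new_command are hypothetical names, and the sketch reads just the
four fields x y z t exactly as quoted above). It assumes that t uses the
same encoding as z, which the proposal does not state explicitly:

```python
from dataclasses import dataclass
from typing import List, Union

Indexes = Union[str, List[int]]  # "all", or an explicit list of indexes

@dataclass
class CommandSpec:
    max_optional: int            # x: 0 = none, -1 = variable, n = maximum
    num_args: int                # y
    translate_optional: Indexes  # z
    translate_args: Indexes      # t (assumed to be encoded like z)

def parse_indexes(field):
    """Decode '-1' (all), '0' (none) or a list such as '1,3,7'."""
    if field == "-1":
        return "all"
    if field == "0":
        return []
    return [int(n) for n in field.split(",")]

def parse_new_command(line):
    """Parse a '% po4a: new_command x y z t' line, or return None."""
    prefix = "% po4a: new_command"
    if not line.startswith(prefix):
        return None
    x, y, z, t = line[len(prefix):].split()
    return CommandSpec(int(x), int(y), parse_indexes(z), parse_indexes(t))
```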

But those macros aside, is a LaTeX module very different from XML
or SGML?  It looks similar to me: there is a stack of environments,
and the parser could be told what to do with these environments by
a command like set_tags_kind (from Sgml.pm).
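
A rough sketch of the environment-stack idea, in Python for illustration
(this is not how Sgml.pm works internally; walk_environments,
should_translate and the UNTRANSLATED set are all hypothetical). Each
text chunk is paired with the stack of environments it sits in, and a
per-environment policy then decides what to do with it:

```python
import re

# Matches \begin{name} and \end{name}.
ENV_RE = re.compile(r"\\(begin|end)\{([^}]*)\}")

# Hypothetical policy: environments whose content is left untranslated.
UNTRANSLATED = {"verbatim"}

def walk_environments(text):
    """Return a list of (environment_stack, chunk) pairs."""
    stack, chunks, pos = [], [], 0
    for m in ENV_RE.finditer(text):
        chunk = text[pos:m.start()].strip()
        if chunk:
            chunks.append((tuple(stack), chunk))
        kind, name = m.group(1), m.group(2)
        if kind == "begin":
            stack.append(name)
        elif not stack or stack.pop() != name:
            raise ValueError("unbalanced environment: " + name)
        pos = m.end()
    tail = text[pos:].strip()
    if tail:
        chunks.append((tuple(stack), tail))
    if stack:
        raise ValueError("unclosed environment: " + stack[-1])
    return chunks

def should_translate(stack):
    """A chunk is translated unless an enclosing environment forbids it."""
    return not any(env in UNTRANSLATED for env in stack)
```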

Denis