[Po4a-devel]Call for a (La)TeX module

Nicolas François nicolas.francois@centraliens.net
Mon, 6 Dec 2004 00:41:00 +0100


On Fri, Nov 26, 2004 at 06:29:24AM +0100, Martin Quinson wrote:
> For [the rare] documents not defining any new macros and sticking to
> unadulterated LaTeX, it should be rather easy to build a first prototype
> simply splitting on limits between TeX's vertical and horizontal modes. 

I think this is working for simple documents:
  paragraphs are separated.
  If a line starts with a known command, that command's subroutine is
  called and the current paragraph is flushed. Otherwise, the line continues
  the current paragraph (or starts a new one).
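
A minimal sketch of the logic described above, written in Python for illustration rather than taken from the module itself. The command list and the names `KNOWN_COMMANDS` and `split_paragraphs` are assumptions made for this example only.

```python
import re

# Hypothetical set of paragraph-breaking commands; not the module's real list.
KNOWN_COMMANDS = {"section", "subsection", "begin", "end", "item"}

# A backslash followed by letters at the very start of the line.
COMMAND_AT_START = re.compile(r"\\([A-Za-z]+)")


def split_paragraphs(lines):
    """Return a list of ("command", line) and ("paragraph", text) chunks."""
    chunks = []
    current = []  # lines of the paragraph being accumulated

    def flush():
        # Emit the pending paragraph, if any, as one chunk.
        if current:
            chunks.append(("paragraph", " ".join(current)))
            current.clear()

    for raw in lines:
        line = raw.strip()
        if not line:
            flush()  # an empty line ends the paragraph
            continue
        match = COMMAND_AT_START.match(line)
        if match and match.group(1) in KNOWN_COMMANDS:
            flush()  # a known command ends the paragraph
            chunks.append(("command", line))
        else:
            current.append(line)  # continue (or start) a paragraph
    flush()
    return chunks
```

With this sketch, a line such as \section{foo} placed between two text lines splits them into two separate paragraphs, while a line starting with an unknown macro stays inside its paragraph.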

>  - As usual (hello Yves), you need to distinguish between inline tags (ups,
>    macros), which you ignore (such as textit or footnotesize or $bla$), and
>    formatting ones, for which you translate the argument (such as \section,
>    \subsubsection or $$bla$$).

  My implementation may have an issue (but I don't know whether it is valid
  LaTeX) with:
Hello there
\section{foo}
End of a paragraph

>  - Translate separately the content of every environment.

  I don't know what an environment is.
  Does it mean that if a command takes two arguments, they have to be
  translated separately?
  I think it is up to the command subroutine.

>  - Some macros need a more complex handling, I'm sure. 

  Sure!
  At this time, I handle \begin{foo}, which calls the foo subroutine.

>  - Translate separately each item (of an itemize and similar environments).

  Done in some places. Probably command-dependent.

>  - Naturally translate separately each paragraph separated by empty lines.

  Done

>  - Ignore stuff like \medskip, since they are formatting only.
>    Hint: it's used in vertical mode. (if there is some \newpage, I guess
>    you're dead)

  I don't know what \medskip is.
  This is still to do, as is \noindent on the line preceding a paragraph.

> And so on and so forth. I believe in this approach for simple documents.
> There are two main jobs here:
> 
>  - write a proper parser, which can detect macros, separate their arguments,
>    etc. This may be the more difficult part. tex is full of \ and { all
>    around the place. You'll have to protect them, and to come up with a
>    usable way to determine the } corresponding to a given { (so that the
>    inbetween can be treated as a macro argument).

   This still needs to be done.
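
   One possible way to find the } corresponding to a given {, sketched in Python under simplifying assumptions: escaped braces (\{ and \}) are skipped, while % comments and verbatim material are not handled at all. The name `find_matching_brace` is made up for this example.

```python
def find_matching_brace(text, open_pos):
    """Return the index of the '}' closing the '{' at open_pos, or -1."""
    if text[open_pos] != "{":
        raise ValueError("open_pos must point at '{'")
    depth = 0
    i = open_pos
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            # Skip the backslash and the character it escapes (e.g. \{ or \}).
            i += 2
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i  # this brace closes the one at open_pos
        i += 1
    return -1  # unbalanced input


# Usage: extract the argument of \section.
text = r"\section{a {b} \} c} tail"
open_pos = text.index("{")
close_pos = find_matching_brace(text, open_pos)
argument = text[open_pos + 1:close_pos]
```

   The text between the two positions can then be handed to the macro handler as its argument.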

>    Classical constructions (item) should be dealt with in there. All the
>    rest should be passed to macro handler just as in the man module.
>    
>  - read a latex definition and write the right handlers for the right macro.
>    There will be a bunch of duplicated work if you don't do as in the man
>    module (or come up with a better idea, of course).

   The code for the current commands will need to be moved into separate
   subroutines.

> Once this is done, you'll be able to deal with documents with no
> \newcommand. For new definitions, I guess that the only viable idea is to
> go for specifically formatted comments in the document (lines beginning with
> '%po4a:' ?) to explain which category each macro belongs to. You may even

This seems reasonable.
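
As an illustration only, such comments could be read as below. The syntax '%po4a: <category> \macro' and the category names `inline` and `block` are hypothetical, invented for this sketch; nothing has been decided about the real format.

```python
def read_macro_categories(lines, prefix="%po4a:"):
    """Map macro names to categories declared in '%po4a:' comment lines.

    Assumes the hypothetical form '%po4a: <category> \\macro'.
    """
    categories = {}
    for line in lines:
        stripped = line.strip()
        if not stripped.startswith(prefix):
            continue  # ordinary text or an ordinary comment
        fields = stripped[len(prefix):].split()
        if len(fields) != 2:
            continue  # malformed annotations are ignored in this sketch
        category, macro = fields
        categories[macro.lstrip("\\")] = category
    return categories


# Usage with two made-up user-defined macros.
sample = [
    "%po4a: inline \\myemph",
    "Some text using \\myemph{words}.",
    "%po4a: block \\mychapter",
    "% an ordinary comment",
]
declared = read_macro_categories(sample)
```

The resulting table could then tell the parser whether a user-defined macro stays inside its paragraph or has its argument translated separately.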

Do you think footnotes should be treated separately?
In that case, how should their location be indicated in the PO file?

I have the same issue with \index, which I separate from the paragraph
only when it appears at the paragraph's beginning or end.

Regards,
-- 
Nekral