[Po4a-devel]Some patches for the Man module

Nicolas François nicolas.francois@centraliens.net
Sun, 10 Oct 2004 22:49:10 +0200


--rwEMma7ioTxnRzrJ
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Sep 27, 2004 at 11:16:39AM +0200, Martin Quinson wrote:
> On Sun, Sep 26, 2004 at 06:03:54PM +0200, Nicolas François wrote:
> > On Fri, Sep 24, 2004 at 01:28:23AM +0200, Martin Quinson wrote:
> > > On Wed, Sep 22, 2004 at 11:19:30PM +0200, Nicolas François wrote:
> >
> > Alioth is up and running again. Here is my account: nekral-guest.
>
> Done. You have the commit right. Please use it, but do not abuse it :)

OK, thank you. I've started using it:
 - one fix in the patch you committed for me
 - documentation of the scripts

> If you want we can stick to the current model for a while, where I review
> and commit your patches. But I gave you the cvs commit right so that at
> least trivial changes (such as the usage of the testsuite stuff) can go
> faster.

For the bigger changes, I will send a patch to the list for review.

> > My problem was: OK, I recognized a comment inside a paragraph; what
> > should I do with this comment?
> > Can I show it in the po (and how?)? Is there any interest in doing this?
> > Can I just trash the comments?
>
> I'd say: push it to the po file. The only issue is that it shows a flaw in
> po4a. Modules have no way [that I could remember] to push comments into
> the po file :-/
>
> Ok, I've added one. It's not tested yet, but if you pass "comment" =>
> "your_comment" at the end of the translate arguments, it should do the
> trick. Now, we should integrate this cleanly into the man module (having a
> variable dedicated to the storage of the comments, and used when pushing
> translations), but I don't want to interfere with your current changes in
> that file.

Nice. I will try it and see whether it is useful for the translators (I
wonder whether some man pages contain too many comments).

> > > >   + nested_fonts
[...]
I've got a font stack now.
It works fine with the regression tests. It may need some more tests at
the po level (see if the po is OK)
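
To illustrate the idea behind the font stack (this is only a simplified
sketch, not the code from the attached patch; the subroutine name
resolve_previous_font is made up for this example, and only the
single-letter fonts B, I, R and P are handled), each \fP can be replaced
by the font it really selects by tracking the current and previous fonts:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Man.pm): replace every \fP by the font
# it actually selects, by tracking the current and the previous font.
sub resolve_previous_font {
    my ($str) = @_;
    my ($current, $previous) = ("R", "R");   # assume we start in Roman
    my $out = "";
    # Split on single-letter font escapes only (\fB, \fI, \fR, \fP);
    # \f(XX and \f[name] are deliberately left out of this sketch.
    foreach my $piece (split /(\\f[BIRP])/, $str) {
        if ($piece =~ /^\\f([BIRP])$/) {
            # \fP means "go back to the previous font"
            my $new = $1 eq "P" ? $previous : $1;
            ($previous, $current) = ($current, $new);
            $out .= "\\f$new";
        } else {
            # plain text between two font modifiers
            $out .= $piece;
        }
    }
    return $out;
}

print resolve_previous_font('\fBbold \fIitalic\fP bold again\fR'), "\n";
```

Once every \fP is explicit, redundant modifiers can be dropped without
breaking groff's own notion of the previous font.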

Do you want all the \f font modifiers to disappear from the po?
If so, I will add marks for other fonts (I've already added a CW mark for
the Constant Width font).

Do you think \s-1 (reduce the size by one point) should be given to the
translators? Is S<-1> better?

> If you don't mind I'll let it mature on your side. I've so little time
> myself...
>
> > 1) change all font modifiers (e.g. .B, .RI,...) to the corresponding \f
> >    I'm thinking of doing this in shiftline (any objection?), because I
> >    need to handle these lines in parse, in the .TP, .SH, and maybe other
> >    macro subroutines.
>
> overriding shiftline could be a good idea. You may want to handle some
> parts of the chaos in an "upper layer". But please make sure to document
> this.

Overriding shiftline works fine. There is however a possible issue: it may
break unshiftline (I will probably override it with a die, or add a FILO
stack of the unshifted lines, which will be used by shiftline).

I'm also wondering which ref I should return when multiple lines have been
shifted with the TransTractor shiftline.
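
The FILO stack option could look roughly like the sketch below. It is
self-contained and does not use the real TransTractor API: the LineSource
package and its input/pushed_back fields are invented for the
illustration, and refs are left out entirely.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a parser with shiftline/unshiftline; this is
# not the TransTractor API, only an illustration of the FILO idea.
package LineSource;

sub new {
    my ($class, @lines) = @_;
    # 'input' holds untouched source lines, 'pushed_back' the lines that
    # were given back through unshiftline (last pushed, first served).
    return bless { input => [@lines], pushed_back => [] }, $class;
}

sub shiftline {
    my $self = shift;
    # Lines that were pushed back are served first, and as-is: they have
    # already been through the joining step below.
    return pop @{ $self->{pushed_back} } if @{ $self->{pushed_back} };
    my $line = shift @{ $self->{input} };
    return undef unless defined $line;
    # Join continuation lines (a backslash at the end of the line).
    while ($line =~ s/\\$// && @{ $self->{input} }) {
        $line .= shift @{ $self->{input} };
    }
    return $line;
}

sub unshiftline {
    my ($self, $line) = @_;
    push @{ $self->{pushed_back} }, $line;
}

package main;

my $src = LineSource->new('.B one \\', 'two', 'three');
my $first = $src->shiftline();     # the two joined lines
$src->unshiftline($first);         # push the joined line back
print $src->shiftline(), "\n";     # served again without re-joining
print $src->shiftline(), "\n";
```

The point of keeping a separate stack is that a line given back by a macro
subroutine is not processed a second time by the overriding shiftline.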

> > Do you think the .so/.mso part is OK ?
>
> Nope, it's not. But you had no way to know it. Btw, I committed all the
> others.
>
> For the sgml module, the policy is to include all the translations in only
> one po file, no matter how many source files there are. This is because
> it's very difficult to parse a sgml sub-file alone. Where it's included in
> the main document is important, as are the entities defined in the prolog
> of the main document, and so on.
>
> So, I'd like to follow the same policy for the man pages. I know that it
> will result in duplicates in the translations, but, well, translators can
> use compendiums.
>
> The preferred handling of .so is thus to read the included file, and then
> unshift all its lines (beginning from the end, of course) into the
> transtractor. Of course, this should be a function of the TransTractor
> itself such as includefile($). But we didn't agree with Jordi on the
> function prototype and syntax before my vacations...

If the file consists only of a .so (as do most of the man pages which use
a .so), a warning/die could be used to recommend simply copying the
original file or making a symlink.
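
For reference, the "unshift the included lines, beginning from the end"
approach can be sketched as follows. This is only an illustration under
assumptions: %files stands in for the filesystem, expand_so and @pending
are invented names, and nothing here is the includefile function, whose
prototype has not been agreed on.

```perl
use strict;
use warnings;

# Hypothetical sketch: %files stands in for the filesystem and @pending
# for the parser's queue of lines still to be read.
my %files = (
    'main.1'   => [ ".TH MAIN 1\n", ".so common.1\n", "Trailing text.\n" ],
    'common.1' => [ ".SH NAME\n", "common \\- shared part\n" ],
);

sub expand_so {
    my ($name) = @_;
    my @pending = @{ $files{$name} };
    my @out;
    while (defined(my $line = shift @pending)) {
        if ($line =~ /^\.so\s+(\S+)/ && exists $files{$1}) {
            # Push the included lines back in reverse order, so that
            # the first included line is the next one to be read.
            unshift @pending, $_ foreach reverse @{ $files{$1} };
            next;
        }
        push @out, $line;
    }
    return @out;
}

print expand_so('main.1');
```

The sketch does not guard against a file that includes itself, which a
real implementation would have to detect.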

.mso requests are probably not translatable (they include tmac files,
i.e. macro definitions).




To increase the number of man pages used by the regression test, I added a
-M option to po4a-normalize (to specify the master charset).
I would appreciate it if somebody more used to po4a could have a look at
which options could be useful for po4a-normalize.


I attach to this mail the following patches:
Man.pm.splitargs.patch
    add a splitargs function (its code is currently inlined in parse)
        This patch is just a code reorganization. It should not change
        anything (I need this function in another place, and it may
        simplify the parse function).
        All the code from the splitargs function comes from the parse
        function.
        The regression test doesn't show any difference.

Man.pm.splitargs.fonts.patch
(has to be applied after the previous one)
    add the font stack
        It mostly consists of a do_fonts function, called in pre_trans.
        There are also set_regular and set_font.
        If you think something needs more comments, please tell me.
        I've used static variables, but I'm not sure it is the right way
        to do this.
        I've also slightly changed the handling of the .B and .BI macros
        (they return to the Roman font, not the previous font).

        Here are the results of the regression tests:
        dir1:splitargs
        dir2:splitargs.fonts

        dir1\dir2   IGN    OK   OK2  WOK1  WOK2  WOK3   PBS WDIFF
              IGN  1734     0     0     0     0     0     0     0
               OK     0   124     0     0     0     0     0     0
              OK2     0     0  1588     0     0     1     4     0
             WOK1     0     0     0   107     0     0     1     0
             WOK2     0     0     0     0   222     1     2     0
             WOK3     0     1    13     2     2   318     6     0
              PBS     0     1   132    10    41    55   596    12
            WDIFF     0     0     0     0     0     0     0    52
        total:  5025 | 5025

Man.pm.splitargs.fonts.shiftline.patch
(has to be applied after the previous one)
    add a shiftline function
        It mostly uses code from the parse function (continuation after a
        backslash at the end of a line).
        It permits (or will permit) some macro subroutines to benefit from
        this part of the parse function (e.g. at this time, a ".B" macro
        called after a TP (groff's indented paragraph with label) is kept
        in the po, instead of providing a B<...> to the translator).

        Here are the results of the regression tests:
        dir1:splitargs.fonts
        dir2:splitargs.fonts.shiftline

        dir1\dir2   IGN    OK   OK2  WOK1  WOK2  WOK3   PBS WDIFF
              IGN  1734     0     0     0     0     0     0     0
               OK     0   126     0     0     0     0     0     0
              OK2     0     0  1729     1     0     0     1     2
             WOK1     0     0     0   118     0     0     1     0
             WOK2     0     0     0     0   265     0     0     0
             WOK3     0     1     8     3     7   353     1     2
              PBS     0     0     0     0     0     0   609     0
            WDIFF     0     0     0     0     0     0     4    60
        total:  5025 | 5025

These patches introduce some issues (for about a dozen pages). These
issues are understood and will be fixed later (I wanted to keep the
patches as simple as possible).
Mostly, these patches allow more pages to be translated and pave the way
for further corrections.
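
For those who do not want to read the whole splitargs patch: the rule it
preserves is "split on spaces, but not on spaces within double quotes".
The following is a much simplified sketch of that rule only, with an
invented name (split_request_args); unlike the real function it ignores
escaped spaces, non-breaking spaces and the debug output.

```perl
use strict;
use warnings;

# Hypothetical, simplified version of the splitting rule: an argument is
# either a double-quoted run (quotes dropped) or a run of non-spaces.
sub split_request_args {
    my ($arguments) = @_;
    my @args;
    # \G continues from the end of the previous match; /c keeps the
    # position when the match finally fails at the end of the string.
    while ($arguments =~ /\G\s*(?:"([^"]*)"|(\S+))/gc) {
        push @args, defined $1 ? $1 : $2;
    }
    return @args;
}

print join("|", split_request_args('foo "bar baz" qux')), "\n";
```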

Any comment (even stylistic) will be appreciated.
Subroutine names and positions can also be changed before inclusion in
CVS.


Kind Regards
-- 
Nekral

--rwEMma7ioTxnRzrJ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="Man.pm.splitargs.patch"

--- Man.pm.orig	2004-10-10 22:07:10.000000000 +0200
+++ Man.pm	2004-10-10 22:07:10.000000000 +0200
@@ -513,95 +513,11 @@ sub parse{
 	    $arg1 .= $2;
 	    my $macro=$2;
 	    my $arguments=$3;
-	    
+
 	    # Split on spaces for arguments, but not spaces within double quotes
 	    my @args=();
-	    my $buffer="";
-	    my $escaped=0;
-	    # change non-breaking space before to ensure that split does what we want
-	    # We change them back before pushing into the arguments. The one which will be
-	    # translated will have the same change again (in pre_trans and post_trans), but
-	    # the ones which won't get translated are not changed anymore. Let's play safe.
-	    $line =~ s/\\ /\xA0/g;
-	    $arguments =~ s/\\ /\xA0/g;
-	    $arguments =~ s/^ +//;
 	    push @args,$arg1;
-	    
-	    foreach my $elem (split (/ +/,$arguments)) {
-		print STDERR ">>Seen $elem(buffer=$buffer;esc=$escaped)\n"
-		    if ($debug{'splitargs'});
-
-		if (length $buffer && !($elem=~ /\\$/) ) {
-		    $buffer .= " ".$elem;
-		    print STDERR "Continuation of a quote\n"
-			if ($debug{'splitargs'});
-		    # print "buffer=$buffer.\n";
-		    if ($buffer =~ m/^"(.*)"(.+)$/) {
-			print STDERR "End of quote, with stuff after it\n"
-			    if ($debug{'splitargs'});
-			my ($a,$b)=($1,$2);
-			$a =~ s/\xA0/\\ /g;
-			$b =~ s/\xA0/\\ /g;
-			push @args,$a;
-			push @args,$b;
-			$buffer = "";
-		    } elsif ($buffer =~ m/^"(.*)"$/) {
-			print STDERR "End of a quote\n"
-			    if ($debug{'splitargs'});
-			my $a = $1;
-			$a =~ s/\xA0/\\ /g;
-			push @args,$a;
-			$buffer = "";
-		    } elsif ($escaped) {
-			print STDERR "End of an escaped sequence\n"
-			    if ($debug{'splitargs'});
-			unless(length($elem)){
-				die sprintf(dgettext("po4a",
-					"po4a::man: %s: Escaped space at the end of macro arg. With high\n".
-					"po4a::man: probability, it won't do the trick with po4a (because of\n".
-					"po4a::man: wrapping). You may want to remove it and use the .nf/.fi groff\n".
-					"po4a::man: macro to control the wrapping."),
-					 $ref)."\n";
-			}
-			$buffer =~ s/\xA0/\\ /g;
-			push @args,$buffer;
-			$buffer = "";
-			$escaped = 0;
-		    }
-		} elsif ($elem =~ m/^"(.*)"$/) {
-		    print STDERR "Quoted, no space\n"
-			if ($debug{'splitargs'});
-		    my $a = $1;
-		    $a =~ s/\xA0/\\ /g;
-		    push @args,$a;
-		} elsif ($elem =~ m/^"/) { #") {
-		    print STDERR "Begin of a quoting arg\n"
-			if ($debug{'splitargs'});
-		    $buffer=$elem;
-		} elsif ($elem =~ m/^(.*)\\$/) {
-		    print STDERR "escaped space after $1\n"
-			if ($debug{'splitargs'});
-		    # escaped space
-		    $buffer = ($buffer?$buffer:'').$1." ";
-		    $escaped = 1; 
-		} else {
-		    print STDERR "Unquoted arg, nothing to declare\n"
-			if ($debug{'splitargs'});
-		    push @args,$elem;
-		    $buffer=""
-		}
-	    }
-	    if ($buffer) {
-		$buffer=~ s/"//g; #"
-		$buffer =~ s/\xA0/\\ /g;
-		push @args,$buffer;
-	    }
-	    if ($debug{'splitargs'}) {
-		print STDERR "ARGS=";
-		map { print STDERR "$_^"} @args;
-		print STDERR "\n";
-	    }
-	    # Done with spliting the args. Do the job.
+            push @args, splitargs($ref,$arguments);
 
 	    if ($macro eq 'B' || $macro eq 'I' || $macro eq 'R') {
 		# pass macro name
@@ -772,6 +688,97 @@ sub docheader {
            ".\\\" \n";
 }
 
+# Split request's arguments.
+# see:
+#     info groff --index-search "Request Arguments"
+sub splitargs {
+    my ($ref,$arguments) = ($_[0],$_[1]);
+    my @args=();
+    my $buffer="";
+    my $escaped=0;
+    # change non-breaking space before to ensure that split does what we want
+    # We change them back before pushing into the arguments. The one which will be
+    # translated will have the same change again (in pre_trans and post_trans), but
+    # the ones which won't get translated are not changed anymore. Let's play safe.
+    $arguments =~ s/\\ /\xA0/g;
+    $arguments =~ s/^ +//;
+    foreach my $elem (split (/ +/,$arguments)) {
+        print STDERR ">>Seen $elem(buffer=$buffer;esc=$escaped)\n"
+            if ($debug{'splitargs'});
+
+        if (length $buffer && !($elem=~ /\\$/) ) {
+            $buffer .= " ".$elem;
+            print STDERR "Continuation of a quote\n"
+                if ($debug{'splitargs'});
+            # print "buffer=$buffer.\n";
+            if ($buffer =~ m/^"(.*)"(.+)$/) {
+                print STDERR "End of quote, with stuff after it\n"
+                    if ($debug{'splitargs'});
+                my ($a,$b)=($1,$2);
+                $a =~ s/\xA0/\\ /g;
+                $b =~ s/\xA0/\\ /g;
+                push @args,$a;
+                push @args,$b;
+                $buffer = "";
+            } elsif ($buffer =~ m/^"(.*)"$/) {
+                print STDERR "End of a quote\n"
+                    if ($debug{'splitargs'});
+                my $a = $1;
+                $a =~ s/\xA0/\\ /g;
+                push @args,$a;
+                $buffer = "";
+            } elsif ($escaped) {
+                print STDERR "End of an escaped sequence\n"
+                    if ($debug{'splitargs'});
+                unless(length($elem)){
+                    die sprintf(dgettext("po4a",
+                                "po4a::man: %s: Escaped space at the end of macro arg. With high\n".
+                                "po4a::man: probability, it won't do the trick with po4a (because of\n".
+                                "po4a::man: wrapping). You may want to remove it and use the .nf/.fi groff\n".
+                                "po4a::man: macro to control the wrapping."),
+                                 $ref)."\n";
+                }
+                $buffer =~ s/\xA0/\\ /g;
+                push @args,$buffer;
+                $buffer = "";
+                $escaped = 0;
+            }
+        } elsif ($elem =~ m/^"(.*)"$/) {
+            print STDERR "Quoted, no space\n"
+                if ($debug{'splitargs'});
+            my $a = $1;
+            $a =~ s/\xA0/\\ /g;
+            push @args,$a;
+        } elsif ($elem =~ m/^"/) { #") {
+            print STDERR "Begin of a quoting arg\n"
+                if ($debug{'splitargs'});
+            $buffer=$elem;
+        } elsif ($elem =~ m/^(.*)\\$/) {
+            print STDERR "escaped space after $1\n"
+                if ($debug{'splitargs'});
+            # escaped space
+            $buffer = ($buffer?$buffer:'').$1." ";
+            $escaped = 1;
+        } else {
+            print STDERR "Unquoted arg, nothing to declare\n"
+                if ($debug{'splitargs'});
+            push @args,$elem;
+            $buffer="";
+        }
+    }
+    if ($buffer) {
+        $buffer=~ s/"//g;
+        $buffer =~ s/\xA0/\\ /g;
+        push @args,$buffer;
+    }
+    if ($debug{'splitargs'}) {
+        print STDERR "ARGS=";
+        map { print STDERR "$_^"} @args;
+        print STDERR "\n";
+    }
+
+    return @args;
+}
 
 ##########################################
 #### DEFINITION OF THE MACROS WE KNOW ####

--rwEMma7ioTxnRzrJ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="Man.pm.splitargs.fonts.patch"

--- Man.pm.splitargs	2004-10-10 22:07:10.000000000 +0200
+++ Man.pm	2004-10-10 22:14:43.000000000 +0200
@@ -250,6 +250,12 @@ use Getopt::Std;
 
 my %macro; # hash of known macro, with parsing sub. See end of this file
 
+# A font starts with \f and is followed either by
+# [.*] - a font name within brackets (e.g. [P], [A_USER_FONT])
+# (..  - a parenthesis followed by two char (e.g. "(CW")
+# .    - a single char (e.g. B, I, R, P, 1, 2, 3, 4, etc.)
+my $FONT_RE = "\\\\f(?:\\[[^\\]]*\\]|\\(..|[^\\(\\[])";
+
 sub initialize {}
 
 #########################
@@ -258,6 +264,7 @@ sub initialize {}
 my %debug=('splitargs' => 0, # see how macro args are separated
 	   'pretrans' => 0,  # see pre-conditioning of translation
 	   'postrans' => 0,  # see post-conditioning of translation
+	   'fonts'    => 0,  # see font modifier handling
 	   );
 
 ###############################################
@@ -309,8 +316,14 @@ sub pre_trans {
     $str =~ s/</E<lt>/sg;
     $str =~ s/EE<lt>gt>/E<gt>/g; # could be done in a smarter way?
 
-    $str =~ s/\\f([SBI])(([^\\]*\\[^f])?.*?)\\f([PR])/$1<$2>/sg;
-    $str =~ s/\\fR(.*?)\\f[RP]/$1/sg;
+    # simplify the fonts for the translators
+    if (defined $self->{type} && $self->{type} =~ m/^(SH|SS)$/) {
+        set_regular("B");
+    }
+    $str = do_fonts($str);
+    if (defined $self->{type} && $self->{type} =~ m/^(SH|HP|SS)$/) {
+        set_regular("R");
+    }
     if ($str =~ /\\f[RSBI]/) {
 	die sprintf(dgettext("po4a",
 		"po4a::man: %s: Nested font modifiers, ie, something like:\n".
@@ -333,30 +346,6 @@ sub pre_trans {
     $str =~ s/\\\*\(rq/''/sg;
     # Change groff non-breaking space to ascii one
     $str =~ s|\\ |\xA0|sg;
-    
-# The next commented loop should take care of badly nested font modifiers,
-#  if only it worked ;)
-#
-#    while ($str =~ /^(.*)\\f([BI])(.*?)\\f([PR])(.*)$/) {
-#	my ($before,$kind,$txt,$end,$after)=($1,$2,$3,$4,$5);
-#	if ($txt =~ /(.*)\\f([BI])(.*)/) {
-#	    my ($inbefore,$kind2,$inafter)=($1,$2,$3);
-#	    #damned, we have something like:
-#	    # \fB bla\fI bli\fR
-#	    if ($end eq 'R') {
-#		# close the to modifier
-#		$str = "$before$kind<$inbefore$kind2<$inafter>>$after";
-#	    } else {
-#		# move back to the first modifier. 
-#		#Use another pass in the loop to handle external modifier
-#		$str = "$before\\f$kind$inbefore$kind2<$inafter>$after";
-#	    }
-#	} else {
-#	    # man authors are not always vicious (only often)
-#	    $str = "$before$kind<$txt>$after";
-#	}
-#    }
-
 
     print STDERR "$str\n" if ($debug{'pretrans'});
     return $str;
@@ -389,8 +378,9 @@ sub post_trans {
 	
     # Make sure we compute internal sequences right.
     # think about: B<AZE E<lt> EZA E<gt>>
-    while ($str =~ m/^(.*)([RSBI])<(.*)$/s) {
+    while ($str =~ m/^(.*)(CW|[RBIC])<(.*)$/s) {
 	my ($done,$rest)=($1."\\f$2",$3);
+	$done =~ s/CW$/\(CW/;
 	my $lvl=1;
 	while (length $rest && $lvl > 0) {
 	    my $first=substr($rest,0,1);
@@ -404,7 +394,8 @@ sub post_trans {
 	}
 	die sprintf("po4a::man: %s: ".dgettext("po4a","Unbalanced '<' and '>' in '%s'"),$ref||$self->{ref},$transstr)."\n"
 	    if ($lvl > 0);
-	$done .= "\\fR$rest";
+	# Return to the regular font
+	$done .= "\\fP$rest";
 	$str=$done;
     }
 
@@ -481,6 +472,7 @@ sub parse{
                             #          until the next fi macro.
 
   LINE:
+    undef $self->{type};
     ($line,$ref)=$self->shiftline();
     
     while (defined($line)) {
@@ -525,7 +517,7 @@ sub parse{
 		my $arg=join(" ",@args);
 		$arg =~ s/^ +//;
 		this_macro_needs_args($macro,$ref,$arg);
-		$paragraph .= "\\f$macro".$arg."\\fP\n";
+		$paragraph .= "\\f$macro".$arg."\\fR\n";
 		goto LINE;
 	    }
 	    # .BI bold alternating with italic
@@ -547,9 +539,9 @@ sub parse{
 		$paragraph.= #($paragraph?"":" ").
 		             join("",
 				  map { $i++ % 2 ? 
-					    "\\f$b$_\\fP" :
-					    "\\f$a$_\\fP"
-				      } @args)."\n";
+					    "\\f$b$_" :
+					    "\\f$a$_"
+				      } @args)."\\fR\n";
 		goto LINE;
 	    }
 
@@ -673,6 +665,7 @@ sub parse{
 
 	# Reinit the loop
 	($line,$ref)=$self->shiftline();
+	undef $self->{type};
     }
 
     if ($paragraph) {
@@ -780,6 +773,159 @@ sub splitargs {
     return @args;
 }
 
+{
+    #static variables
+    # font stack.
+    #     Keep track of the current font (because a font modifier can
+    #     stay open at the end of a paragraph), and the previous font (to
+    #     handle \fP)
+    my $current_font  = "R";
+    my $previous_font = "R";
+    # $regular_font describes the "Regular" font, which is the font used
+    # when there is no font modifier.
+    # For example, .SS use a Bold font, and thus in
+    # .SS This is a \fRsubsection\fB header
+    # the \fR and \fB font modifiers have to be kept.
+    my $regular_font  = "R";
+
+    # Set the regular font
+    # It takes the regular font in argument (when no argument is provided,
+    # it uses "R").
+    sub set_regular {
+        print STDERR "set_regular('@_')\n"
+            if ($debug{'fonts'});
+        set_font(@_);
+        $regular_font = $current_font;
+    }
+
+    sub set_font {
+        print STDERR "set_font('@_')\n"
+            if ($debug{'fonts'});
+        my $saved_previous = $previous_font;
+        $previous_font = $current_font;
+
+	if (! defined $_[0]) {
+            $current_font = "R";
+        } elsif ($_[0] =~ /^(P|\[\]|\[P\])/) {
+            $current_font = $saved_previous;
+        } elsif (length($_[0]) == 1) {
+            $current_font = $_[0];
+        } elsif (length($_[0]) == 2) {
+            $current_font = "($_[0]";
+        } else {
+            $current_font = "[$_[0]]";
+        }
+        print STDERR "r:'$regular_font', p:'$previous_font', c:'$current_font'\n"
+            if ($debug{'fonts'});
+    }
+
+    sub do_fonts {
+        # one argument: a string
+        my $str = $_[0];
+        print STDERR "do_fonts('$str')="
+	    if ($debug{'fonts'});
+
+        # restore the font stack
+        $str = "\\f$previous_font\\f$current_font".$str;
+        my @array1=split(/\\f/, $str);
+
+        $str = shift @array1;  # The first element is always empty because
+                               # the $current_font was put at the beginning
+        # $last_font indicates the last font that was appended to the buffer.
+        # It differs from $current_font because consecutive identical fonts
+        # are not written in the buffer.
+        my $last_font=$regular_font;
+
+        foreach my $elem (@array1) {
+#print STDERR "elem'".$elem."'r:$regular_font"."'l:$last_font"."'p:$previous_font"."'c:$current_font'\n";
+            # Replace \fP by the exact font (because some font modifiers will
+            # be removed, which will break groff's font stack)
+            $elem =~ s/^(P|\[\]|\[P\])/$previous_font/s;
+                # change \f1 to \fR, etc.
+                # Those fonts are defined in the DESC file, which
+                # may depend on the groff device.
+                # fonts 1 to 4 are usually mapped to R, I, B, BI
+                # TODO: use an array for the font positions. This
+                # array should be updated by .fp requests.
+                $elem =~ s/^1/R/;
+                $elem =~ s/^2/I/;
+                $elem =~ s/^3/B/;
+                $elem =~ s/^4/BI/;
+
+            if ($elem =~ /^([1-4]|B|I|R|\(CW|\[\]|\[P\]|C(?!W))(.*)$/s) {
+                # Each element should now start by a recognized font modifier
+                my $new_font = $1;
+                my $arg = $2;
+                # Update the font stack
+                $previous_font = $current_font;
+                $current_font = $new_font;
+
+                if ($new_font eq $last_font) {
+                    # continue with the same font.
+                    $str.=$arg;
+                } else {
+                    # A new font is used, update $last_font
+                    $last_font = $new_font;
+                    $str .= "\\f".$elem;
+                }
+            } else {
+                die sprintf("po4a::man: ".dgettext("po4a","Unsupported font in: '%s'."),$elem)."\n";
+            }
+        }
+        # Do some simplification (they don't change the font stack)
+        # Remove empty font modifiers at the end
+        $str =~ s/($FONT_RE)*$//s;
+
+        # close any font modifier
+        if ($str =~ /.*($FONT_RE)(.*?)$/s && $1 ne "\\f$regular_font") {
+            $str =~ s/(\n?)$/\\f$regular_font$1/;
+        }
+
+        # remove fonts with empty argument
+        while ($str =~ /($FONT_RE){2}/) {
+            # while $str has two consecutive font modifiers
+            # only keep the second one.
+            $str =~ s/($FONT_RE)($FONT_RE)/$2/s;
+        }
+
+        # the regular font modifier at the beginning of the string is not
+        # needed (this subroutine ensures that every paragraph ends with
+        # the regular font).
+        $str =~ s/^\\f$regular_font//;
+
+        # Use special markup for common fonts, so that translators don't see
+        # groff's font modifiers
+        my $PO_FONTS = "B|I|R|\\(CW|C";
+        $PO_FONTS =~ s/^$regular_font\|//;
+        $PO_FONTS =~ s/\|$regular_font\|/|/;
+        $PO_FONTS =~ s/\|$regular_font$//;
+        while ($str =~ /^(.*?)                  # $1: anything (non greedy: as
+                                                # few as possible)
+                         \\f($PO_FONTS)         # followed by a common font
+                                                # modifier ($2)
+                         ((?:\\[^f]|[^\\])*)    # $3: the text concerned by
+                                                # this font (i.e. without any
+                                                # font modifier, i.e. it
+                                                # contains no '\' followed by
+                                                # an 'f'
+		         \\f                    # the next font modifier
+                         (.*)$/sx) {            # $4: anything up to the end
+            my ($begin, $font, $arg, $end) = ($1,$2,$3,$4);
+            if ($end =~ /^$regular_font(.*)$/s) {
+                # no need to add a switch to $regular_font
+                $str = $begin."$font<$arg>$1";
+            } else {
+                $str = $begin."$font<$arg>\\f$end";
+            }
+        }
+        $str =~ s/\(CW</CW</sg;
+
+        print STDERR "'$str'\n" if ($debug{'fonts'});
+#print STDERR "'r:$regular_font"."'l:$last_font"."'p:$previous_font"."'c:$current_font'\n";
+        return $str;
+    }
+}
+
 ##########################################
 #### DEFINITION OF THE MACROS WE KNOW ####
 ##########################################
@@ -857,7 +1003,15 @@ $macro{'SP'}=\&untranslated;	
 #           As a result,  all  following  paragraph(s) will be indented until
 #           the corresponding .RE.
 #  .RE      End  relative  margin indent.
-$macro{'LP'}=$macro{'P'}=$macro{'PP'}=$macro{'RE'}=\&noarg;
+$macro{'LP'}=$macro{'P'}=$macro{'PP'}=sub {
+    noarg(@_);
+
+    # From info groff:
+    # The font size and shape are reset to the default value (10pt roman if no
+    # `-rS' option is given on the command line).
+    set_font("R");
+};
+$macro{'RE'}=\&noarg;
 $macro{'RS'}=\&untranslated;
 
 #Indented Paragraph Macros
@@ -878,6 +1032,12 @@ $macro{'TP'}=sub {
 	chomp($l2);
     }
     $self->pushline($self->t($l2)."\n");
+
+    # From info groff:
+    # Note that neither font shape nor font size of the label [i.e. argument
+    # or first line] is set to a default value; on the other hand, the rest of
+    # the text has default font settings.
+    set_font("R");
 };
 
 #   Indented Paragraph Macros
@@ -920,6 +1080,11 @@ $macro{'IP'}=sub {
     } else {
 	$self->pushmacro(@_);
     }
+
+    # From info groff:
+    # Font size and face of the paragraph (but not the designator) are reset
+    # to their default values.
+    set_font("R");
 };
 
 # Hypertext Link Macros
@@ -982,7 +1147,13 @@ $macro{'fam'}=\&untranslated;
 # .fc a b   Set field delimiter to a and pad character to b.
 $macro{'fc'}=\&untranslated;
 # .ft font  Change to font name or number font;
-$macro{'ft'}=\&untranslated;
+$macro{'ft'}=sub {
+    if (defined $_[2]) {
+        set_font($_[2]);
+    } else {
+        set_font("P");
+    }
+};
 # .hc c     Set up additional hyphenation indicator character c.
 $macro{'hc'}=\&untranslated;
 # .hy       Enable hyphenation (see nh)

--rwEMma7ioTxnRzrJ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="Man.pm.splitargs.fonts.shiftline.patch"

--- Man.pm.splitargs.fonts	2004-10-10 22:16:50.000000000 +0200
+++ Man.pm	2004-10-10 22:16:52.000000000 +0200
@@ -267,6 +267,70 @@ my %debug=('splitargs' => 0, # see how m
 	   'fonts'    => 0,  # see font modifier handling
 	   );
 
+# This function returns the next line of the document being parsed
+# (and its reference).
+# It overrides the TransTractor shiftline to handle:
+#   - font requests (.B, .I, .BR, .BI, ...)
+#     because these requests can be present in a paragraph (handled
+#     in the parse subroutine), or in argument (on the next line)
+#     of some other request (for example .TP)
+#   - font size requests (not done yet)
+#   - input escape (\ at the end of a line)
+sub shiftline {
+    my $self = shift;
+    # call Transtractor's shiftline
+    my ($line,$ref) = $self->SUPER::shiftline();
+
+    if (!defined $line) {
+        # end of file
+        return ($line,$ref);
+    }
+
+    chomp $line;
+    while ($line =~ /^\..*\\$/ || $line =~ /^(\.[BI])\s*$/) {
+        my ($l2,$r2)=$self->SUPER::shiftline();
+        chomp($l2);
+        if ($line =~ /^(\.[BI])\s*$/) {
+            $l2 =~ s/"/\\"/g;
+            $line .= ' "'.$l2.'"';
+        } else {
+            $line =~ s/\\$//;
+            $line .= $l2;
+        }
+    }
+    $line .= "\n";
+
+    # Handle font requests here
+    if ($line =~ /^[.'][\t ]*([BIR]|BI|BR|IB|IR|RB|RI)(?:(?: +|\t)(.*)|)$/) {
+        my $macro = $1;
+        my $arguments = $2;
+        my @args = splitargs($ref,$arguments);
+        if ($macro eq 'B' || $macro eq 'I' || $macro eq 'R') {
+            my $arg=join(" ",@args);
+            $arg =~ s/^ +//;
+            this_macro_needs_args($macro,$ref,$arg);
+            $line = "\\f$macro".$arg."\\fR\n";
+        }
+        # .BI bold alternating with italic
+        # .BR bold/roman
+        # .IB italic/bold
+        # .IR italic/roman
+        # .RB roman/bold
+        # .RI roman/italic
+        if ($macro eq 'BI' || $macro eq 'BR' || $macro eq 'IB' || 
+            $macro eq 'IR' || $macro eq 'RB' || $macro eq 'RI'   ) {
+            # num of seen args, first letter of macro name, second one
+            my ($i,$a,$b)=(0,substr($macro,0,1),substr($macro,1));
+            $line = join("", map { $i++ % 2 ? 
+                                    "\\f$b$_" :
+                                    "\\f$a$_"
+                                 } @args)."\\fR\n";
+        }
+    }
+
+    return ($line,$ref);
+}
+
 ###############################################
 #### FUNCTION TO TRANSLATE OR NOT THE TEXT ####
 ###############################################
@@ -478,17 +542,6 @@ sub parse{
     while (defined($line)) {
 #	print STDERR "line=$line;ref=$ref";
 	chomp($line);
-	while ($line =~ /^\..*\\$/ || $line =~ /^(\.[BI])\s*$/) {
-	    my ($l2,$r2)=$self->shiftline();
-	    chomp($l2);
-	    if ($line =~ /^(\.[BI])\s*$/) {
-		$l2 =~ s/"/\\"/g;
-		$line .= ' "'.$l2.'"';
-	    } else {
-		$line =~ s/\\$//;
-		$line .= $l2;
-	    }
-	}
 	$self->{ref}="$ref";
 #	print STDERR "LINE=$line<<\n";
 	die sprintf("po4a::man: %s: ".dgettext("po4a","Escape sequence \\c encountered. This is not handled yet.")
@@ -499,7 +552,6 @@ sub parse{
 	if ($line =~ /^\./) {
 	    die sprintf("po4a::man: ".dgettext("po4a","Unparsable line: %s"),$line)."\n"
 		unless ($line =~ /^(\.+\\*?)(\\\")(.*)/ ||
-			$line =~ /^(\.)([BI])(\W.*)/ ||
 			$line =~ /^(\.)(\S*)(.*)/);
 	    my $arg1=$1;
 	    $arg1 .= $2;
@@ -511,39 +563,6 @@ sub parse{
 	    push @args,$arg1;
             push @args, splitargs($ref,$arguments);
 
-	    if ($macro eq 'B' || $macro eq 'I' || $macro eq 'R') {
-		# pass macro name
-		shift @args;
-		my $arg=join(" ",@args);
-		$arg =~ s/^ +//;
-		this_macro_needs_args($macro,$ref,$arg);
-		$paragraph .= "\\f$macro".$arg."\\fR\n";
-		goto LINE;
-	    }
-	    # .BI bold alternating with italic
-	    # .BR bold/roman
-	    # .IB italic/bold
-	    # .IR italic/roman
-	    # .RB roman/bold
-	    # .RI roman/italic
-	    # .SB small/bold
-	    if ($macro eq 'BI' || $macro eq 'BR' || $macro eq 'IB' || 
-		$macro eq 'IR' || $macro eq 'RB' || $macro eq 'RI' ||
-		$macro eq 'SB') {
-		# pass macro name
-		shift @args;
-		# num of seen args, first letter of macro name, second one
-		my ($i,$a,$b)=(0,substr($macro,0,1),substr($macro,1));
-		# Do the job
-#		$self->pushline(".br\n") unless (length($paragraph));
-		$paragraph.= #($paragraph?"":" ").
-		             join("",
-				  map { $i++ % 2 ? 
-					    "\\f$b$_" :
-					    "\\f$a$_"
-				      } @args)."\\fR\n";
-		goto LINE;
-	    }
 
 	    if ($paragraph) {
 		do_paragraph($self,$paragraph,$wrapped_mode);

--rwEMma7ioTxnRzrJ--