[Po4a-devel]Sgml and generic file inclusion

Martin Quinson martin.quinson@loria.fr
Fri, 3 Jun 2005 15:03:23 +0200


On Fri, May 27, 2005 at 01:43:38PM +0200, Nicolas François wrote:
> Hello,
>
> With Martin having no time, but spending a lot of time on po4a, it seems
> the right moment to ask this.

Arf, you cretin ;)
You're right, I should talk less about working on real-life stuff and do it
more :-D

> First, extracting the file inclusion part from the TeX module to put it in
> the Transtractor works.
> It has mostly one drawback: the file inclusion has to be done in one line.
> For example:
>   .so foo.1
> or:
>   \include{foo.tex}\input{bar}
> should work, but:
>   <!ENTITY foo SYSTEM
>   foo_foo_foo_foo_foo_foo_foo_foo_foo_foo_foo_foo_foo_foo.sgml>
> will not work (at least currently; if it is needed, it will work)

Actually, it shouldn't be an issue, at least not for ?ml, since it is not the
lines you quoted (the entity declaration) which should be replaced by the
file content, but the &foo; reference.
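
To illustrate the point, here is a minimal sketch, assuming a hypothetical
helper named system_entities() that is not part of po4a. It collects the
SYSTEM entity declarations from the whole text (so a declaration split over
two lines is not a problem) and then looks for the &name; references, which
are the real inclusion points. Parameter entities and PUBLIC identifiers are
deliberately ignored, and the file name is made-up test data.

```perl
use strict;
use warnings;

# Hypothetical helper (not po4a API): map SYSTEM entity names to file names.
# The match runs on the whole text, so the declaration may span lines.
sub system_entities {
    my ($text) = @_;
    my %files;
    while ($text =~ /<!ENTITY\s+(\S+)\s+SYSTEM\s+"?([^\s">]+)"?\s*>/g) {
        $files{$1} = $2;
    }
    return %files;
}

my $doc   = "<!ENTITY foo SYSTEM\n  foo_foo.sgml>\n<p>&foo;</p>\n";
my %files = system_entities($doc);

# The inclusion happens where the entity is referenced, not where it is declared
my @refs = $doc =~ /&([\w.-]+);/g;
printf "%s is defined in %s\n", $_, $files{$_}
    for grep { exists $files{$_} } @refs;
```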

> To use this file inclusion mechanism, a module will have to implement one
> function, which will find the filename and return the piece of text to put
> before and after the inclusion of the given file. (It is a little bit more
> complicated, but basically works this way.)
> For example, the Man module could return "", "foo.1", "" when it receives
> ".so foo.1"
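
As I read it, the per-module function could look like the sketch below. The
names man_inclusion() and tex_inclusion() are hypothetical, and so is the
convention of returning an empty list when the line includes nothing; only
the ("", "foo.1", "") result comes from the example above.

```perl
use strict;
use warnings;

# Hypothetical per-module hooks: given one input line, return
# (text before, filename, text after), or an empty list when the line
# does not include any file.
sub man_inclusion {
    my ($line) = @_;
    return ("", $1, "") if $line =~ /^\.so\s+(\S+)\s*$/;
    return;
}

sub tex_inclusion {
    my ($line) = @_;
    # Only the first inclusion is extracted; the rest stays in "after"
    return ($1, $2, $3)
        if $line =~ /^(.*?)\\(?:include|input)\{([^}]+)\}(.*)$/s;
    return;
}

my @man = man_inclusion(".so foo.1");
my @tex = tex_inclusion('\include{foo.tex}\input{bar}');
print join("|", @man), "\n";
print join("|", @tex), "\n";
```

In the TeX case the "after" part still contains \input{bar}, so the caller
would presumably have to run the function again on the remaining text.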

Funny, actually. I'd have done this the other way around, with an
unshiftfile() function provided by the Transtractor and used by all the
modules. Something like:
$macro{'so'} = sub {
    my $self = shift;
    $self->unshiftfile($_[1]);
};

sub unshiftfile {
    my $self = shift;
    my $filename = shift
        or croak wrap_msg(dgettext("po4a", "Can't read from unshift_file without having a filename"));
    my $linenum = 0;

    my @array;
    # Read the whole file into @array, one [text, reference] pair per line
    open my $input, '<', $filename
        or croak wrap_msg(dgettext("po4a", "Can't read from %s: %s"), $filename, $!);

    while (defined (my $textline = <$input>)) {
        $linenum++;
        my $ref = "$filename:$linenum";
        # Keep each pair together; a flat list would lose the pairing
        push @array, [$textline, $ref];

        # Detect if this file has non-ascii characters
        if ($self->{TT}{ascii_input}) {
            my $decoder = guess_encoding($textline);
            if (!ref($decoder) or $decoder !~ /Encode::XS=/) {
                # We have detected a non-ascii line
                $self->{TT}{ascii_input} = 0;
                # Save the reference for future error message
                $self->{TT}{non_ascii_ref} ||= $ref;
            }
        }
    }
    close $input
        or croak wrap_msg(dgettext("po4a", "Can't close %s after reading: %s"), $filename, $!);

    # Unshift all the file content (in reverse order, so that the end of
    # the file is unshifted first and the first line ends up in front)
    $self->unshiftline(@$_) for reverse @array;
}


But I may have misunderstood what you mean?
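
To show why the lines have to be unshifted in reverse order, here is a small
self-contained sketch. ToyQueue is a made-up stand-in for the Transtractor
input queue (the real object keeps much more state); only the
unshiftline()/shiftline() idea is taken from the discussion above.

```perl
use strict;
use warnings;

# Toy stand-in for the Transtractor input queue (an assumption made for
# illustration only).
package ToyQueue;

sub new {
    my $class = shift;
    return bless { lines => [] }, $class;
}

# Put one (text, reference) pair back at the front of the pending input
sub unshiftline {
    my ($self, $text, $ref) = @_;
    unshift @{ $self->{lines} }, [$text, $ref];
    return;
}

# Take the next (text, reference) pair, or an empty list when exhausted
sub shiftline {
    my $self  = shift;
    my $entry = shift @{ $self->{lines} };
    return $entry ? @$entry : ();
}

package main;

my $queue = ToyQueue->new;
$queue->unshiftline("text after the include\n", "main.1:2");

# Lines of the included file, in file order
my @included = (["first\n", "foo.1:1"], ["second\n", "foo.1:2"]);
$queue->unshiftline(@$_) for reverse @included;

my @order;
while (my ($text, $ref) = $queue->shiftline) {
    push @order, $ref;
}
print "@order\n";   # prints "foo.1:1 foo.1:2 main.1:2"
```

The included lines come out in file order, followed by whatever was already
pending in the including file.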

> Other facilities are also provided (for example, the TeX module needs to
> find a file whether the .tex extension is given or not).
>
>
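
For the extension lookup, something along these lines might be enough. This
is a sketch only: find_input_file() is a hypothetical name, the existence
check is passed in so that the example runs without touching the disk, and
the real TeX search rules are more involved than "try the name, then the
name plus .tex".

```perl
use strict;
use warnings;

# Hypothetical helper: return the first existing candidate for $name,
# trying the name as given and then with the default extension appended.
sub find_input_file {
    my ($name, $ext, $exists) = @_;
    $exists ||= sub { -f $_[0] };    # default: look on disk
    for my $candidate ($name, "$name.$ext") {
        return $candidate if $exists->($candidate);
    }
    return;
}

# Fake directory content, used instead of the real file system
my %on_disk = map { $_ => 1 } qw(foo.tex bar.tex);
my $lookup  = sub { $on_disk{ $_[0] } };

my $with_ext    = find_input_file("foo.tex", "tex", $lookup);
my $without_ext = find_input_file("bar",     "tex", $lookup);
print "$with_ext $without_ext\n";
```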
> So here are some questions regarding the Sgml module:
>   Is there any problem with the current implementation of the inclusion?

No. As long as nsgmls stays in our way, I don't think we should touch the sgml
file inclusion (which is already implemented, despite the complexity induced
by the nsgmls cruft).

>   The Sgml module is quite different from the other modules (the lines are
>   not shifted one by one, but the file is given as a whole to nsgmls). So
>   is this module a good choice for testing the file inclusion? Maybe I can
>   try to implement it in order to be able to use the whole file.
>
>   Is there another module which could need inclusion, and could be used to
>   test if this mechanism is sufficient and generic enough?

Man and Xml are good candidates for this feature, IMHO.

> I will submit the patch to the list, so you can have a look at it. But I'm
> a little bit reluctant to commit it before it is tested by at least
> two modules.

We have plenty of time, I think.

Bye, Mt.
