[Po4a-devel] Sgml module does not translate the lang attribute

Martin Quinson martin.quinson@loria.fr
Thu, 26 May 2005 16:30:36 +0200


On Thu, May 26, 2005 at 01:32:27PM +0200, Francois Gouget wrote:
> Martin Quinson wrote:
>
> The Xml.pm module simply creates an msgid with the lang value and lets
> the translator provide the new value. In fact it has a list of
> attributes that need translation so that one can provide translations
> for arbitrary attributes.
>
> I think this approach makes sense so here is what I did:
>
>  * I added a new 'attribute' 'tag kind'. I know that an attribute is
> not a tag but this lets me reuse the framework.
>
>  * I added support for an 'attribute' option which lets the user expand
> the list of attributes that need to be translated. It works exactly like
> the existing 'translate', 'section', 'indent', etc. options...
>
>  * For the DocBook document type I added 'lang' to the 'attribute' list.
>
>  * And finally near line 690 I translate the attribute value if its
> name is in the %attribute hash.
>
> There's one thing that Xml.pm supports that this does not support:
> Xml.pm lets you specify that an attribute must only be translated if it
> is found in a specific tag list:
>
>    You can specify the attributes by their name (for example, "lang"),
>    but you can prefix it with a tag hierarchy, to specify that this
>    attribute will only be translated when it's into the specified tag.
>    For example: <bbb><aaa>lang specifies that the lang attribute will
>    only be translated if it's into an <aaa> tag, and it's into a <bbb>
>    tag.
>
> I think this functionality can be added later if needed. For now it
> seems mostly overkill to me.

Well. That may be an issue. In HTML, some attributes need to be translated
when in a specific tag and not in others. I think we don't need to keep the
whole stack, but the enclosing tag seems important to me.

> @@ -639,7 +652,10 @@ sub parse_file {
>      foreach (split(/ /, ($self->{SGML}->{k}{'ignore'}) || '')) {
>  	$exist{uc $_} = 1;
>      }
> -
> +      foreach (split(/ /, ($self->{SGML}->{k}{'attribute'}) || '')) {
> +	$attribute{uc $_} = 1;
> +    }
> +
>
>      # What to do before parsing
>

Maybe change $attribute{uc $_} to an array of enclosing tags?
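
As a rough sketch of that idea (not the actual Sgml.pm code), the option parsing could record, for each attribute, the tags in which it may be translated. The `<tag>name` syntax is borrowed from the Xml.pm documentation quoted above, limited here to a single enclosing tag. The helper names `parse_attribute_option` and `attribute_is_translatable` are made up for illustration, and a hash of tags stands in for the suggested array because the lookup is simpler.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an option string such as 'lang <a>title'
# into { LANG => { '*' => 1 }, TITLE => { A => 1 } }.
# '*' means "translate in any enclosing tag".
sub parse_attribute_option {
    my ($option) = @_;
    my %attribute;
    foreach my $spec (split / /, $option || '') {
        # "<tag>name" restricts the attribute to one enclosing tag;
        # a bare "name" applies everywhere.
        my ($tag, $name) = $spec =~ /^<([^>]+)>(.+)$/ ? ($1, $2) : ('*', $spec);
        $attribute{uc $name}{uc $tag} = 1;
    }
    return \%attribute;
}

# Hypothetical helper: should attribute $name be translated when it
# appears inside tag $tag?
sub attribute_is_translatable {
    my ($attribute, $tag, $name) = @_;
    my $tags = $attribute->{uc $name} or return 0;
    return ($tags->{'*'} || $tags->{uc $tag}) ? 1 : 0;
}

my $attribute = parse_attribute_option('lang <a>title');
print attribute_is_translatable($attribute, 'book', 'lang'),  "\n";  # 1
print attribute_is_translatable($attribute, 'img',  'title'), "\n";  # 0
print attribute_is_translatable($attribute, 'a',    'title'), "\n";  # 1
```

Only the immediately enclosing tag is checked, so no tag stack has to be kept.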


> There is another aspect that could be criticised: the patch will
> typically result in the following msgid:
>
> msgid "en"
> msgstr "fr"
>
> That can be a bit ambiguous.

That's why you added a comment in the PO file using the last arguments of
the translate function (or you should :).

> Maybe the documentation has an 'en' that needs to be translated
> differently somewhere else.

That's more problematic.

> If it is felt that this is an issue it would be pretty easy to modify the
> patch so that the msgid reads as follows:
>
> msgid "lang=en"
> msgstr "lang=fr"

Well. Why not? We could then translate the whole tag, something like:

> # Please translate the lang attribute
> msgid "<book lang=en>"
> msgstr "<book lang=fr>"

If it's feasible, it'd be even better.
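
A minimal sketch of how the whole-tag variant might work, assuming a tag with a single unquoted attribute as in the example above. Both helpers, `tag_msgid` and `value_from_msgstr`, are hypothetical and not part of po4a: the first builds the msgid shown to the translator, the second recovers the attribute value from the msgstr and falls back to the original value if the translator changed the tag or attribute name.

```perl
use strict;
use warnings;

# Hypothetical helper: rebuild the opening tag that goes into the msgid.
sub tag_msgid {
    my ($tag, $name, $value) = @_;
    return "<$tag $name=$value>";
}

# Hypothetical helper: extract the translated attribute value from the
# msgstr. If the msgstr no longer looks like "<tag name=value>", keep
# the original value rather than guessing.
sub value_from_msgstr {
    my ($msgstr, $tag, $name, $original) = @_;
    return $msgstr =~ /^<\Q$tag\E \Q$name\E=(.*)>$/ ? $1 : $original;
}

my $msgid = tag_msgid('book', 'lang', 'en');
print "$msgid\n";                                                   # <book lang=en>
print value_from_msgstr('<book lang=fr>', 'book', 'lang', 'en'), "\n";  # fr
```

Putting the tag in the msgid keeps a bare "en" elsewhere in the document from sharing the same entry, at the cost of having to parse the translated string again.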


I haven't committed it yet. I'd prefer to discuss it a bit further, if you
don't mind.
