[xml/sgml] RFC: dh_installxmlcatalogs: add ability to read catalog.xml files directly

Daniel Leidert daniel.leidert.spam at gmx.net
Fri Feb 8 00:40:49 UTC 2008


Cross-posted to debian-xml-sgml-devel at lists.alioth.d.o, debian-sgml at
l.d.o and debian-devel at l.d.o. Please set follow-ups to the list where
the discussion fits best (I read all lists), probably debian-sgml.

Hello,

An item regarding the way dh_installxmlcatalogs works has been on my
TODO list for some time now. At the moment, package maintainers have to
write the file $package.xmlcatalogs by hand, although the resulting
.xmlcatalogs files often contain just the following (the text inside the
double quotation marks is a simple XPath expression standing for the
value that goes there):

package;uri;"rewriteURI[@uriStartString]";/usr/share/../xmlcatalog.xml
package;public;"public[@publicId]";/usr/share/../xmlcatalog.xml
package;system;"rewriteSystem[@systemIdStartString]";/usr/share/../xmlcatalog.xml
package;system;"system[@systemId]";/usr/share/../xmlcatalog.xml

So I wonder whether dh_installxmlcatalogs should get the ability to read
catalog.xml files directly. To allow for this, I am thinking of adding a
new entry type (or more, if necessary):

local-and-package

example:

local-and-package;debian/catalog.xml;/usr/share/xml/foo/custom/bar/catalog.xml

This entry should a) install the local catalog and b) register all
<rewriteURI>, <public>, <rewriteSystem> and <system> tags (more
precisely: their related attribute values) with update-xmlcatalog.
("All" has a catch, see the next paragraph.) Package maintainers would
then no longer have to copy the information found in the catalog.xml
file into the .xmlcatalogs file themselves; dh_installxmlcatalogs would
do it automatically.
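
Step b) above could look roughly like the following sketch. It is written
in Python only for brevity and is not the Perl script mentioned in this
mail. The function name package_entries, the sample catalog and the
destination path are made up for illustration, and the output simply
follows the package;type;id;catalog pattern shown earlier (any quoting
rules for the ID field are left out).

```python
import xml.etree.ElementTree as ET

# Namespace of OASIS XML Catalogs documents.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# (element, attribute, entry type), following the four lines in the mail.
ENTRY_MAP = [
    ("rewriteURI", "uriStartString", "uri"),
    ("public", "publicId", "public"),
    ("rewriteSystem", "systemIdStartString", "system"),
    ("system", "systemId", "system"),
]


def package_entries(catalog_xml, dest):
    """Return 'package;type;id;dest' lines for one catalog document.

    catalog_xml is the catalog content as a string; dest is the installed
    path of the catalog (hypothetical in the example below).
    """
    root = ET.fromstring(catalog_xml)
    lines = []
    for element, attribute, entry_type in ENTRY_MAP:
        # iter() walks the whole tree, so entries inside <group> are found too.
        for node in root.iter("{%s}%s" % (CATALOG_NS, element)):
            value = node.get(attribute)
            if value:
                lines.append("package;%s;%s;%s" % (entry_type, value, dest))
    return lines


# Made-up catalog used only to exercise the function.
SAMPLE = """<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//Example//DTD Foo 1.0//EN" uri="foo.dtd"/>
  <rewriteSystem systemIdStartString="http://example.org/foo/"
                 rewritePrefix="./"/>
  <rewriteURI uriStartString="http://example.org/foo/" rewritePrefix="./"/>
</catalog>"""

if __name__ == "__main__":
    for line in package_entries(SAMPLE, "/usr/share/xml/foo/catalog.xml"):
        print(line)
```

The entries are grouped by element type in the order of ENTRY_MAP, not in
document order; whether that matters for update-xmlcatalog is not
something this sketch tries to answer.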

However, this raises a problem. In the case of DocBook XML, for example,
all catalog.xml files contain the same entry for a public ID
(-//OASIS//DTD XML Exchange Table Model 19990315). update-xmlcatalog
would then be asked to register this ID several times, which raises an
error. So dh_installxmlcatalogs needs a way to exclude IDs and URIs. IMO
it simply needs a -X option, in line with the -X switch of the other
debhelper scripts.
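
The exclusion could be sketched as follows, again in Python for brevity.
It assumes the proposed -X would do a plain substring match, as the -X
switch of other debhelper scripts does for file names; the function name
filter_excluded and the second ID in the example list are illustrative
only.

```python
def filter_excluded(identifiers, exclude_patterns):
    """Drop every ID or URI that contains one of the exclude patterns.

    Substring matching is an assumption modeled on debhelper's -X switch;
    the proposal itself does not fix the matching rule.
    """
    return [
        identifier
        for identifier in identifiers
        if not any(pattern in identifier for pattern in exclude_patterns)
    ]


# The first ID is the one quoted in this mail; the second is just an example.
IDS = [
    "-//OASIS//DTD XML Exchange Table Model 19990315",
    "-//OASIS//DTD DocBook XML V4.5//EN",
]

if __name__ == "__main__":
    for identifier in filter_excluded(IDS, ["XML Exchange Table Model"]):
        print(identifier)
```

With no exclude patterns the list is returned unchanged, so the default
behaviour would stay "register everything".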

I wrote a short script that uses libxml-parser-perl (XML::Parser) to
parse a catalog.xml file for this purpose. However, this parser seems to
pull in too many dependencies, so I should use a different one.

Now I would like to request some comments regarding:

- the idea itself: criticism, objections, problems I missed
- the new entry type(s) (should this also work for root catalog
  entries?)
- the (chosen) Perl XML parser (dependencies, speed)

Based on the results of this discussion, I will aim to implement this in
xml-core 0.12.

Regards, Daniel



