Re: load can't find missing DTD when not validating against it...

On 25 Oct 2010, at 09:56, Jostein Austvik Jacobsen wrote:

> Oh, so XML documents that don't have a resolvable DTD are not only invalid but also ill-formed. That seems kind of unnecessary... Especially considering all the DTD-based documents out there already, and external tools which require the DTD doctype to be defined. We'll have to inspect the documents' doctypes before running XProc scripts on them. Oh well...
> 
> For now I guess I'll try going for your first choice of creating an empty DTD (or make a copy of the actual DTD) and insert an absolute URI to it in the documents. Ugly, but it will work.
> 
> Thanks
> Jostein

Jostein

I've worked round this problem in a similar way to Mike's suggestion before. 

You are always running the risk that the XML document contains a reference to an entity that is declared only in the DTD. If that's the case, having the DTD (or at least its entity declarations) is your only choice.
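To get a feel for how exposed a given document is to that risk, a rough check along these lines might help. It's only a sketch: the function name undeclared_entities is my own invention, and it uses regular expressions rather than a real parser, so treat the result as a hint and not a guarantee.

```python
import re

# The five entities every XML parser knows without any DTD.
PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

# Comments and CDATA sections can contain "&name;" text that is not a reference.
_IGNORED = re.compile(r"<!--.*?-->|<!\[CDATA\[.*?\]\]>", re.DOTALL)
# General entity references such as &nbsp; (character references like &#160; are skipped).
_ENTITY_REF = re.compile(r"&([A-Za-z_:][\w.:-]*);")
# General entities declared in the document's own internal subset.
_INTERNAL_DECL = re.compile(r"<!ENTITY\s+([A-Za-z_:][\w.:-]*)\s")


def undeclared_entities(xml_text):
    """Return sorted names of entities referenced but not declared in the document itself."""
    text = _IGNORED.sub("", xml_text)
    declared = set(_INTERNAL_DECL.findall(text))
    referenced = set(_ENTITY_REF.findall(text))
    return sorted(referenced - declared - PREDEFINED)


if __name__ == "__main__":
    sample = '<!DOCTYPE book SYSTEM "book.dtd"><book>Caf&eacute;&nbsp;&amp; bar</book>'
    print(undeclared_entities(sample))
```

If the list comes back empty, an empty stand-in DTD is probably safe for that document. Anything it does report has to be declared somewhere the parser can reach.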

However, you could try using a resolver that supports version 1.1 of the OASIS XML Catalogs specification. With a bit of thought and a generally valid assumption you can match most DTDs with a single file. The 1.1 spec adds the systemSuffix element, which matches on the trailing part of a system identifier. Now, most DTDs use a '.dtd' suffix, so something like:

<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer="system">
	<systemSuffix systemIdSuffix=".dtd" uri="dummy.dtd"/>
</catalog>

will match most DTDs. You could even define common entities in your dummy.
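In case it helps to see what the resolver is doing with that entry, here is a minimal sketch of the suffix rule. This is not a real resolver, just an illustration: the function names are made up, the XML declaration is left off the embedded catalog for simplicity, and it skips the system identifier normalisation and base URI handling that a proper implementation has to do. As I read the 1.1 spec, when several systemSuffix entries match, the longest suffix wins, which is what the sketch assumes.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Same catalog as in this mail, minus the XML declaration.
CATALOG = """<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer="system">
    <systemSuffix systemIdSuffix=".dtd" uri="dummy.dtd"/>
</catalog>
"""


def load_suffix_rules(catalog_xml):
    """Collect (suffix, uri) pairs from the systemSuffix entries of a catalog."""
    root = ET.fromstring(catalog_xml)
    tag = "{%s}systemSuffix" % CATALOG_NS
    return [(e.get("systemIdSuffix"), e.get("uri")) for e in root.iter(tag)]


def resolve_system_id(system_id, rules):
    """Return the uri of the longest matching suffix, or None if nothing matches."""
    matches = [(suffix, uri) for suffix, uri in rules if system_id.endswith(suffix)]
    if not matches:
        return None
    # Longest suffix wins (my reading of the 1.1 spec; simplified here).
    return max(matches, key=lambda rule: len(rule[0]))[1]


if __name__ == "__main__":
    rules = load_suffix_rules(CATALOG)
    print(resolve_system_id("http://example.org/schemas/book.dtd", rules))
    print(resolve_system_id("book.xsd", rules))
```

Note that the comparison in this sketch is case-sensitive, so a system identifier ending in '.DTD' would slip past the single '.dtd' entry.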

The Apache commons resolver supports version 1.1 catalogs if I remember correctly. I think Norm Walsh wrote one that does too. You can use a custom resolver with Calabash and I assume Calumet allows it too.



nic
--
Nic Gibson
Corbas Consulting
Digital Publishing Consultancy and Training
http://www.corbas.co.uk, +44 (0)7718 906817

Received on Monday, 25 October 2010 09:10:51 UTC