Re: load can't find missing DTD when not validating against it...

From: Jostein Austvik Jacobsen <josteinaj@gmail.com>
Date: Mon, 25 Oct 2010 11:30:36 +0200
Message-ID: <AANLkTimXT8LqNvKGAtMP_jBDcc5Xy-uJDbvMrOHB2Xrh@mail.gmail.com>
To: Nic Gibson <nicg@corbas.net>
Cc: Michael Sokolov <sokolov@ifactory.com>, xproc-dev@w3.org

Thanks for the tip.

I'm generating a bunch of DITA documents (spanning multiple folders, both
maps and reference fragments). After generating them, I use the DITA
Open Toolkit to further transform the generated content to HTML, PDF etc. So
fortunately I have control over which entities go into the files.
Unfortunately, the toolkit requires that the DTDs be declared. An
alternative would be to leave the doctype unset until the last time I p:store
the files, then iterate over them all once more and assign the appropriate
DTDs. That way I wouldn't have to work around the DTDs during processing.

For now a temporary solution will do, but I'll switch to a custom resolver
soon.
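
To check my understanding of the systemSuffix matching in the catalog quoted
below, here is a minimal Python sketch of what I assume a 1.1-aware resolver
does with it. The function names (load_suffix_rules, resolve_system_id) and
the sample system identifiers are hypothetical and only for illustration; a
real resolver also normalizes identifiers and handles the other catalog entry
types, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# The catalog from the quoted message, unchanged apart from indentation.
CATALOG_XML = """<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
         prefer="system">
  <systemSuffix systemIdSuffix=".dtd" uri="dummy.dtd"/>
</catalog>
"""


def load_suffix_rules(catalog_xml):
    """Return (suffix, uri) pairs for every systemSuffix entry (hypothetical helper)."""
    root = ET.fromstring(catalog_xml.encode("utf-8"))
    tag = "{%s}systemSuffix" % CATALOG_NS
    return [(el.get("systemIdSuffix"), el.get("uri")) for el in root.iter(tag)]


def resolve_system_id(system_id, rules):
    """Map a system identifier to a catalog URI by trailing match.

    Assumption: when several suffixes match, the longest one wins, which is
    how I read the OASIS XML Catalogs 1.1 rule. Returns None if nothing matches.
    """
    matches = [(suffix, uri) for suffix, uri in rules if system_id.endswith(suffix)]
    if not matches:
        return None
    return max(matches, key=lambda rule: len(rule[0]))[1]


if __name__ == "__main__":
    rules = load_suffix_rules(CATALOG_XML)
    # Hypothetical system identifiers, as they might appear in a doctype.
    for system_id in ("reference.dtd", "../dtd/map.dtd", "schema.xsd"):
        print(system_id, "->", resolve_system_id(system_id, rules))
```

With that single entry, any system identifier ending in ".dtd" ends up at
dummy.dtd and everything else is left unresolved, which is the behaviour I
would expect from the real resolver.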

Regards
Jostein

2010/10/25 Nic Gibson <nicg@corbas.net>

>
> On 25 Oct 2010, at 09:56, Jostein Austvik Jacobsen wrote:
>
> Oh, so XML documents that don't have a resolvable DTD are not only
> invalid but also ill-formed. That seems kind of unnecessary... Especially
> considering all the DTD-based documents out there already, and external
> tools which require the DTD doctype to be defined. We'll have to inspect the
> documents' doctypes before running XProc scripts on them. Oh well...
>
> For now I guess I'll go with your first choice of creating an empty
> DTD (or making a copy of the actual DTD) and inserting an absolute URI to it
> in the documents. Ugly, but it will work.
>
> Thanks
> Jostein
>
>
> Jostein
>
> I've worked round this problem in a similar way to Mike's suggestion
> before.
>
> You are always running the risk that the XML document contains an entity
> that was defined in the DTD. If that's the case, having the DTD is your only
> choice.
>
> However, you could try using a resolver that supports version 1.1 of the
> OASIS catalog specification. With a bit of thought and a generally valid
> assumption you can match most DTDs with a single entry. The OASIS 1.1 catalog
> spec provides the systemSuffix element, which matches on the trailing part
> of a system identifier. Now, most DTDs use a '.dtd' suffix so something
> like:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
> prefer="system">
>  <systemSuffix systemIdSuffix=".dtd" uri="dummy.dtd"/>
> </catalog>
>
> will match most DTDs. You could even define common entities in your dummy.
>
> The Apache commons resolver supports version 1.1 catalogs if I remember
> correctly. I think Norm Walsh wrote one that does too. You can use a custom
> resolver with Calabash and I assume Calumet allows it too.
>
>
>
> nic
> --
> Nic Gibson
> Corbas Consulting
> Digital Publishing Consultancy and Training
> http://www.corbas.co.uk, +44 (0)7718 906817
>
Received on Monday, 25 October 2010 09:31:31 GMT