RE: Modified DTDs

On Thu, 5 Aug 2004 clong@itlnet.net wrote:

> <?xml  version="1.0" ?><html>
> <head>

Please use plain text in E-mail.

> I modified a DTD to include the <foot> element (as an example for my
> class).

... so that the root element's content is head, body, foot, with the foot
element containing blocks. Looks OK to me, syntactically.

> Here's the page...
>
> http://www.billnchimene.com/index.html

The W3C validator says 'element "foot" undefined', which is fairly
confusing, since it clearly _is_ defined.

If you try it with the WDG validator http://www.htmlhelp.com/validator/
you get a huge number of errors of the following kind:

"http://www.billnchimene.com/DTD/extend.dtd, line 239, character 34:
omitted tag minimization parameter can be omitted only if "OMITTAG NO" is
specified on the SGML declaration"

So the validator is unable to process even the DTD correctly.
I guess the same happens on the W3C validator, with much worse
error recovery.

The WDG validator's help files explain problems with custom XML DTDs:
http://www.htmlhelp.com/tools/validator/tips.html#customxml
And in fact
<http://www.htmlhelp.com/cgi-bin/validate.cgi?
url=http%3A%2F%2Fwww.billnchimene.com%2Findex.html&warnings=yes&xml=yes>
reports that the document passes validation.

Apparently the problem is that a validator needs to be told, or it needs
to guess, whether it is performing the job of an SGML validator or the job
of an XML validator. With predefined, catalogued DTDs, they presumably use
the FPI or the URL to resolve this. With other DTDs they probably run in
SGML mode (implying the SGML declaration for HTML). And this means that
any XML DTD will result in a disaster, unless some fix is applied.
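
To illustrate the kind of guess I mean, here is a rough Python sketch. It
is only my assumption of how such a heuristic could look, not how either
validator actually works; the names uses_tag_minimization and guess_mode
and the sample declarations for "foot" are made up for the example. It
relies on two facts: an XML document may start with an XML declaration,
and SGML-style element declarations for HTML carry tag minimization
parameters ("- -", "- O"), which XML DTDs never have.

```python
import re

# SGML-style element declaration with tag minimization parameters,
# e.g. <!ELEMENT FOOT - - (%block;)+> or <!ELEMENT BR - O EMPTY>.
# The element "name" may also be a parenthesized name group.
SGML_ELEMENT = re.compile(
    r"<!ELEMENT\s+(?:\([^)]*\)|[^\s>]+)\s+[-Oo]\s+[-Oo]\s"
)


def uses_tag_minimization(dtd_text):
    """Return True if any element declaration has minimization parameters."""
    return SGML_ELEMENT.search(dtd_text) is not None


def guess_mode(document_text, dtd_text):
    """Guess "xml" or "sgml" for a document and its DTD (hypothetical)."""
    # An XML declaration at the start of the document is a strong hint.
    if document_text.lstrip().startswith("<?xml"):
        return "xml"
    # Minimization parameters cannot occur in an XML DTD.
    if uses_tag_minimization(dtd_text):
        return "sgml"
    # Without them, the DTD cannot be parsed under the SGML declaration
    # for HTML, so XML is the only guess that can work.
    return "xml"


if __name__ == "__main__":
    xml_dtd = "<!ELEMENT foot (%block;)*>"       # made-up XML-style declaration
    sgml_dtd = "<!ELEMENT FOOT - - (%block;)+>"  # made-up SGML-style declaration
    print(guess_mode('<?xml version="1.0" ?><html>', xml_dtd))
    print(guess_mode("<html>", sgml_dtd))
```

A validator that always assumes SGML mode for unknown DTDs skips this
kind of check, which would explain the flood of "omitted tag
minimization parameter" errors quoted above.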

But my analysis might be partly wrong. This is all very confusing, since
validators, believed to perform a well-defined rigorous check, actually
play fast and loose and work "heuristically".

> In the past, the validator had no problems validating this page with the
> added element....

I have seen similar phenomena - actually, I think the validator first had
the limitation, then it was removed, and now it has been re-introduced.
This would have happened for a DTD that had an element added to a long
list, thereby exceeding an internal capacity restriction. But my memory
might not serve me well, and I might be confusing the situation with the
fact that the WDG validator hasn't got the limitation. Cf.
http://lists.w3.org/Archives/Public/www-validator/2004May/0178.html
which also seems to leave the issue open. That is, the problem (or, as I
would call it, a bug - a quantitative limitation is not a bug, but
claiming a valid document invalid is a bug) has been reported and
acknowledged, but there's no correction to be expected.

Thus, I would recommend using the WDG validator when working with custom
DTDs based on HTML DTDs, especially XHTML DTDs.

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/

Received on Friday, 6 August 2004 07:30:27 UTC