DRAFT response to Comment about documents with an "empty DTD"

Below is my draft response to the comment whose subject is:

Clarify that documents with DOCTYPE but without markup
declaration are not subject to validation

If you have any thoughts or reservations, please comment.

Henry (and others), please see a separate email where I
respond to your email at
http://lists.w3.org/Archives/Public/xml-editor/2014JanMar/0004

paul

======================================================

In response to the following comment [some content elided
and some rearranged]:

> Clarify that documents with DOCTYPE but without markup
> declaration are not subject to validation
>
> . . .
> XML 1.0 fifth edition says:
>
> “[Definition: An XML document is valid if it has an associated
> document type declaration and if the document complies with
> the constraints expressed in it.]”
>
> Question: But which constraints does a document type declaration
> without an internal or external DTD express?
>
> . . .
>
> [S]ome XML tools reports validation constraint errors for
> documents with the HTML5 doctype. This happens because the very
> HTML5 DOCTYPES apparently causes some tools to dip into DTD
> validation mode - and subsequently report all elements and
> attributes as an error, since none of them are defined in
> the (non-existing) DTD.
>
> When trying to discuss this behavior when XML tools developers, it
> would be helpful to have an authoritative statement to point to.
>
> Therefore, my proposal is to extract rules or guidance for what
> to do when the DOCTYPE declaration points to no markup declaration
> and place this into the 6th edition of XML. (Or to put it differently:
> define what to do when the DOCTYPE lacks an internal or external DTD.)
>

At [1] we have:

Definition: An XML document is valid if it has an associated
document type declaration and if the document complies with
the constraints expressed in it.

At [2] we have:

validity constraint

[Definition: A rule which applies to all valid XML documents.
Violations of validity constraints are errors; they MUST, at
user option, be reported by validating XML processors.]

As indicated above, a document is not valid if it violates a
validity constraint. Perhaps that could be made clearer in
the definition of "valid" at [1]. But given that fact, and
given the "Element Valid" validity constraint at [3], and the
"Attribute Value Type" validity constraint at [4], a document
containing any element or attribute for which there is no
declaration in the associated DTD is not valid.

Put another way, one of the constraints a DTD puts on a
document (for the document to be considered valid) is that
the document must not contain any element or attribute that
is not declared in the DTD. So a DTD that declares no
elements or attributes constrains the document to have
no elements or attributes if it is to be considered valid.
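To make the point above concrete, here is a minimal sketch in
Python. It is not a validator and checks only the "must be
declared" part of the two validity constraints cited above. The
function name undeclared_names and its two "declared" parameters
are hypothetical, introduced for illustration; the empty defaults
stand in for a DTD that declares nothing.

```python
import xml.etree.ElementTree as ET


def undeclared_names(xml_text, declared_elements=frozenset(),
                     declared_attributes=frozenset()):
    """Return (elements, attributes) used in the document but not declared.

    declared_elements holds element names; declared_attributes holds
    (element name, attribute name) pairs. Both are illustrative stand-ins
    for the declarations a real DTD would supply.
    """
    root = ET.fromstring(xml_text)
    missing_elements = set()
    missing_attributes = set()
    for elem in root.iter():
        # Every element must have a matching declaration.
        if elem.tag not in declared_elements:
            missing_elements.add(elem.tag)
        # Every attribute must have been declared for its element.
        for attr in elem.attrib:
            if (elem.tag, attr) not in declared_attributes:
                missing_attributes.add((elem.tag, attr))
    return sorted(missing_elements), sorted(missing_attributes)


# A doctype declaration with no internal or external subset declares nothing.
doc = '<!DOCTYPE html><html lang="en"><body/></html>'
print(undeclared_names(doc))
# prints (['body', 'html'], [('html', 'lang')])
```

With nothing declared, every element and attribute in the
document is reported, which matches the tool behavior described
in the comment.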

As for the claim that "documents with DOCTYPE but without
markup declaration are not subject to validation", the XML
spec has no concept of "subject to validation". That is a
tool issue.
Per section 5.1 Validating and Non-Validating Processors [5]:

Conforming XML processors fall into two classes: validating
and non-validating.

Nowhere does the spec say that anything in the document (e.g.,
a doctype declaration) forces use of a validating processor.
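As an illustration, the sketch below feeds a document with the
HTML5-style doctype to a non-validating processor. It assumes
Python's standard xml.etree.ElementTree module, whose default
parser is built on Expat, a non-validating parser; the sample
document is made up for the example.

```python
import xml.etree.ElementTree as ET

# Well-formed XML with a doctype declaration that has neither an
# internal nor an external subset.
doc = '<!DOCTYPE html><html><head/><body/></html>'

# A non-validating processor checks well-formedness only, so the
# undeclared elements are not reported as errors.
root = ET.fromstring(doc)
child_tags = [child.tag for child in root]
print(root.tag, child_tags)
# prints html ['head', 'body']
```

The presence of the doctype declaration does not change how this
processor behaves; whether to validate is a choice made by the
tool that selects the processor.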

HTML5 is not XML, and it can make its own rules about how
documents should be processed. Admittedly, if a tool is
using an XML processor to process an HTML5 document, it
should probably not use validation mode, but that is not
something for the XML spec to address.

Paul Grosso
for the XML Core WG


[1] http://www.w3.org/TR/REC-xml/#dt-valid
[2] http://www.w3.org/TR/REC-xml/#dt-vc
[3] http://www.w3.org/TR/REC-xml/#elementvalid
[4] http://www.w3.org/TR/REC-xml/#ValueType
[5] http://www.w3.org/TR/REC-xml/#proc-types

Received on Monday, 27 January 2014 22:31:52 UTC