
Re: Clarify that documents with DOCTYPE but without markup declaration are not subject to validation

From: Paul Grosso <paul@paulgrosso.name>
Date: Wed, 05 Feb 2014 11:19:49 -0600
Message-ID: <52F272B5.6090200@paulgrosso.name>
To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, xml-editor@w3.org
[Some content of the original comment has been elided
and/or rearranged below.]

On 2014-01-19 14:29, Leif Halvard Silli wrote:
 > Clarify that documents with DOCTYPE but without markup
 > declaration are not subject to validation
 >
 > . . .
 > XML 1.0 fifth edition says:
 >
 > “[Definition: An XML document is valid if it has an associated
 > document type declaration and if the document complies with
 > the constraints expressed in it.]”
 >
 > Question: But which constraints does a document type declaration
 > without an internal or external DTD express?
 >
 > . . .
 >
 > [S]ome XML tools report validation constraint errors for
 > documents with the HTML5 doctype. This happens because the
 > HTML5 DOCTYPE itself apparently causes some tools to switch into
 > DTD validation mode and subsequently report all elements and
 > attributes as errors, since none of them are defined in
 > the (non-existent) DTD.
 >
 > When trying to discuss this behavior with XML tools developers, it
 > would be helpful to have an authoritative statement to point to.
 >
 > Therefore, my proposal is to extract rules or guidance for what
 > to do when the DOCTYPE declaration points to no markup declaration
 > and place this into the 6th edition of XML. (Or to put it differently:
 > define what to do when the DOCTYPE lacks an internal or external DTD.)
 >


At [1] we have:

  Definition: An XML document is valid if it has an associated
  document type declaration and if the document complies with
  the constraints expressed in it.

At [2] we have:

  validity constraint

  [Definition: A rule which applies to all valid XML documents.
  Violations of validity constraints are errors; they MUST, at
  user option, be reported by validating XML processors.]

As indicated above, a document is not valid if it violates a
validity constraint. Perhaps that could be made clearer in
the definition of "valid" at [1]. But given that fact, and
given the "Element Valid" validity constraint at [3], and the
"Attribute Value Type" validity constraint at [4], a document
containing any element or attribute for which there is no
declaration in the associated DTD is not valid.

Put another way, one of the constraints a DTD puts on a
document (for the document to be considered valid) is that
the document must not contain any element or attribute that
is not declared in the DTD. So a DTD that declares no
elements or attributes constrains the document to have
no elements or attributes to be considered valid (and
such a document would not have a root element and would
therefore not be valid).
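
As a rough illustration of that point, here is a minimal sketch
in Python using the standard-library expat bindings. The helper
name undeclared_names and the sample documents are hypothetical,
and the check covers only the "must be declared" part of the
Element Valid and Attribute Value Type constraints for an
internal subset; it is not a validating processor.

```python
import xml.parsers.expat


def undeclared_names(xml_text):
    """Return (elements, attributes) used but not declared in the internal subset.

    Hypothetical helper: it only checks that declarations exist, not that
    content models or attribute types are satisfied.
    """
    declared_elements = set()
    declared_attrs = set()      # (element name, attribute name) pairs
    missing_elements = []
    missing_attrs = []

    parser = xml.parsers.expat.ParserCreate()

    def on_element_decl(name, model):
        # Called once per <!ELEMENT ...> declaration.
        declared_elements.add(name)

    def on_attlist_decl(elname, attname, type_, default, required):
        # Called once per attribute declared in an <!ATTLIST ...>.
        declared_attrs.add((elname, attname))

    def on_start(name, attrs):
        # Called for every start tag in the document body.
        if name not in declared_elements:
            missing_elements.append(name)
        for attname in attrs:
            if (name, attname) not in declared_attrs:
                missing_attrs.append((name, attname))

    parser.ElementDeclHandler = on_element_decl
    parser.AttlistDeclHandler = on_attlist_decl
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return missing_elements, missing_attrs


# A DOCTYPE with no markup declarations declares nothing, so every
# element and attribute in the document is reported as undeclared.
print(undeclared_names('<!DOCTYPE html><html lang="en"><body/></html>'))
```

With a DOCTYPE that carries no markup declarations, the set of
declared names is empty, so the sketch flags every element and
attribute, which is the behavior the original comment describes.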

As for the claim that "documents with DOCTYPE but without markup
declaration are not subject to validation", the XML spec has
no concept of "subject to validation". That is a tool issue.
Per section 5.1 Validating and Non-Validating Processors [5]:

  Conforming XML processors fall into two classes: validating
  and non-validating.

Nowhere does the spec say that anything in the document (e.g.,
a doctype declaration) forces use of a validating processor.

HTML5 can make its own rules about how a tool should process
documents. Admittedly, if a tool is using an XML processor
to process an HTML5 document, it should probably not use
validation mode, but that is not something for the XML spec
to address.
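
For what it is worth, a tool can simply choose a non-validating
processor for such documents. The following is a minimal sketch,
assuming Python's standard-library xml.etree.ElementTree (which
checks well-formedness only); the sample document and the helper
name parse_without_validation are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: an XML-serialized HTML5 document whose DOCTYPE has
# neither an internal subset nor an external identifier.
DOC = """<!DOCTYPE html>
<html><head><title>Example</title></head><body><p>Hello</p></body></html>"""


def parse_without_validation(text):
    """Parse with ElementTree, which does not perform DTD validation."""
    return ET.fromstring(text)


root = parse_without_validation(DOC)
print(root.tag)                        # prints: html
print([child.tag for child in root])   # prints: ['head', 'body']
```

The doctype declaration is accepted as well-formed and no
validity errors are raised, because the choice of processor, not
the document, determines whether validation happens.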

The XML Core WG will consider issuing an erratum that augments
the definition of valid at [1] to read something like:

  Definition: An XML document is valid if it has an associated
  document type declaration and if the document complies with
  the constraints expressed in it and the document violates no
  validity constraints.

We might also add a sentence to the first paragraph of the
Conformance section at [5] so that the paragraph would
then read something like:

  Conforming XML processors fall into two classes: validating
  and non-validating.  The determination of which kind of
  processor to use for a given document is outside the scope
  of this Recommendation.

We realize this still leaves unanswered the issue of how
to decide if a document should be "subject to validation".
At the present time at least, that issue is not addressed
by the XML Recommendation.

Paul Grosso
for the XML Core WG


[1] http://www.w3.org/TR/REC-xml/#dt-valid
[2] http://www.w3.org/TR/REC-xml/#dt-vc
[3] http://www.w3.org/TR/REC-xml/#elementvalid
[4] http://www.w3.org/TR/REC-xml/#ValueType
[5] http://www.w3.org/TR/REC-xml/#proc-types
Received on Wednesday, 5 February 2014 17:20:15 UTC
