- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Tue, 28 Jan 2014 18:49:28 +0000
- To: John Cowan <cowan@mercury.ccil.org>
- Cc: Paul Grosso <paul@paulgrosso.name>, core <public-xml-core-wg@w3.org>, "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>, Tim Bray <twbray@google.com>, Jean Paoli <jeanpa@microsoft.com>
John Cowan writes:

> Henry S. Thompson scripsit:
>
>> What I meant to claim, wrt the examples cited, was that documents
>> without a document type declaration _cannot_ be valid, or invalid,
>> because the definition of validity depends on _having_ a document type
>> declaration.
>
> It seems absolutely bizarre to me to introduce a sense of the word
> "invalid" (now used but not defined in the XML Rec) that means anything
> other than "not valid". A document without a DOCTYPE cannot be valid,
> as you say, and saying it is _therefore_ invalid seems to me to be the
> most natural form of expression.

I guess it's the linguist in me. I'm quite happy to say that "The
present king of France is bald" is not true, but not that it's false,
or untrue. Similarly I'm happy to say that <html/> is not valid, but
not that it's invalid.

But _that's_ a matter of nomenclature, and as John correctly points
out, 'invalid' is a term which appears only 3 times in the spec (and
not at all in V1 of the spec), is not defined, and is arguably used
non-normatively in those three cases.

> In particular, if there is no DTD, and the user has not chosen to turn
> off reporting of validity errors, a validating parser must diagnose
> a violation of the Element Valid VC for every element, and a violation
> of the Attribute Value Type VC for every attribute. The definition
> of Element Valid begins "An element is valid if there is a declaration
> matching elementdecl where the Name matches the element type" and since
> there is not, every element in the document is invalid. Similarly,
> the definition of Attribute Value Type begins "The attribute must have
> been declared", and it has not.
>
>> <!DOCTYPE html>
>> <xyzzy/>
>
> A validating parser, [must report] a violation of the Element
> Valid VC

I agree wrt that example, because, at least on a pedantic reading, it
_has_ a document type definition. But I don't agree wrt just plain

  <xyzzy/>

which definitely lacks a document type definition.
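The difference between the two documents above can be made concrete with a minimal sketch. It assumes Python's standard-library xml.dom.minidom, which is a non-validating parser: it does not check any validity constraint and only reports whether a document type declaration is present. The function names and the returned labels are hypothetical, chosen for illustration, and appear in neither the spec nor any parser.

```python
from xml.dom.minidom import parseString


def has_doctype(xml_text):
    """Return True if the well-formed document has a document type declaration."""
    # minidom exposes the declaration as Document.doctype (None when absent).
    return parseString(xml_text).doctype is not None


def validity_question(xml_text):
    """Hypothetical labelling of the two cases under discussion.

    This does NOT validate anything; it only separates the case where the
    Element Valid VC clearly applies from the disputed no-declaration case.
    """
    if has_doctype(xml_text):
        return "has declaration: VCs apply"
    return "no declaration: disputed"


with_decl = "<!DOCTYPE html>\n<xyzzy/>"
without_decl = "<xyzzy/>"

print(validity_question(with_decl))
print(validity_question(without_decl))
```

Both inputs are well-formed; only the first gives a validating parser declarations to check elements against. The second, plain <xyzzy/>, is the case in dispute.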
I believe John and Paul are both arguing that a parser when invoked in
validating mode must report a violation of the Element Valid VC for
such a document.

For me that case is the crux of the matter, and it asks a substantive
question. In practice neither rxp nor xmllint reports an Element Valid
VC violation when invoked on that document in validating mode---are
they to be labelled non-conforming as a result? On my interpretation,
they're right not to do so. On John's (and Paul's, I think) they're
wrong.

At the very least the wording of the definition of *valid* [1] and the
wording in the Conformance section [2] need to be brought into line.
But I'm curious what the original authors thought they were asking for
a parser to do, when invoked in validating mode, on a well-formed
document with no document type definition.

I find rxp's warning message the most illuminating:

  Document has no DTD, validating abandoned

I note, against my preference, something I've always been perplexed by
(at least I'm consistent): There are only three possible categories
allowed for a test in the metadata of the XML Test Suite [3]:

  valid
  invalid
  not-wf

and e.g. test p22pass1 in the NIST/OASIS part of the suite, which is

  <doc/>

is categorised as 'invalid'.

ht

[1] http://www.w3.org/TR/REC-xml/#dt-valid
[2] http://www.w3.org/TR/REC-xml/#dt-validating
[3] http://www.w3.org/XML/Test/xmlconf-20130923.html
-- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
 [mail from me _always_ has a .sig like this -- mail without it is forged spam]
Received on Tuesday, 28 January 2014 18:50:22 UTC