Re: On Henry's comment about documents with DOCTYPE but without markup declaration

John Cowan writes:

> Henry S. Thompson scripsit:
>
>> What I meant to claim, wrt the examples cited, was that documents
>> without a document type declaration _cannot_ be valid, or invalid,
>> because the definition of validity depends on _having_ a document type
>> declaration.
>
> It seems absolutely bizarre to me to introduce a sense of the word
> "invalid" (now used but not defined in the XML Rec) that means anything
> other than "not valid".  A document without a DOCTYPE cannot be valid,
> as you say, and saying it is _therefore_ invalid seems to me to be the
> most natural form of expression.

I guess it's the linguist in me.  I'm quite happy to say that

  "The present King of France is bald"

is not true, but not that it's false, or untrue.

Similarly I'm happy to say that

<html/>

is not valid, but not that it's invalid.

But _that's_ a matter of nomenclature, and as John correctly points
out, 'invalid' is a term which appears only 3 times in the spec (and
not at all in V1 of the spec), is not defined, and is arguably used
non-normatively in those three cases.

> In particular, if there is no DTD, and the user has not chosen to turn
> off reporting of validity errors, a validating parser must diagnose
> a violation of the Element Valid VC for every element, and a violation
> of the Attribute Value Type VC for every attribute.  The definition
> of Element Valid begins "An element is valid if there is a declaration
> matching elementdecl where the Name matches the element type" and since
> there is not, every element in the document is invalid.  Similarly,
> the definition of Attribute Value Type begins "The attribute must have
> been declared", and it has not.
>
>> <!DOCTYPE html>
>> <xyzzy/>
>
> A validating parser, [must report] a violation of the Element
> Valid VC

I agree wrt that example, because, at least on a pedantic reading, it
_has_ a document type definition.  But I don't agree wrt just plain

<xyzzy/>

which definitely lacks a document type definition.  I believe John and
Paul are both arguing that a parser when invoked in validating mode
must report a violation of the Element Valid VC for such a document.

For me that case is the crux of the matter, and it asks a substantive
question.  In practice neither rxp nor xmllint reports an Element Valid
VC violation when invoked on that document in validating mode.  Are
they to be labelled non-conforming as a result?

On my interpretation, they're right not to do so.  On John's (and
Paul's, I think) they're wrong.
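
The difference between the two readings can be made concrete with a
minimal Python sketch.  It does not reflect how rxp or xmllint work
internally.  It uses the standard library's expat bindings and checks
only the first clause of the Element Valid VC (is the element type
declared?).  The function name classify and its require_doctype flag
are invented here for illustration.

```python
import xml.parsers.expat


def classify(xml_text, require_doctype):
    """Classify a document under one of the two readings discussed above.

    require_doctype=True  : validity is only assessed when a document type
                            declaration is present.
    require_doctype=False : every element must be declared, DOCTYPE or not.
    Only the "is this element type declared?" part of the Element Valid VC
    is checked; content models and attributes are ignored (a simplification).
    """
    seen = {"doctype": False, "declared": set(), "used": set()}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Fired once if the document carries a <!DOCTYPE ...> declaration.
        seen["doctype"] = True

    def on_element_decl(name, model):
        # Fired for each <!ELEMENT ...> declaration in the internal subset.
        seen["declared"].add(name)

    def on_start_element(name, attributes):
        seen["used"].add(name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.ElementDeclHandler = on_element_decl
    parser.StartElementHandler = on_start_element
    try:
        parser.Parse(xml_text, True)
    except xml.parsers.expat.ExpatError:
        return "not-wf"

    if require_doctype and not seen["doctype"]:
        return "unassessed"
    undeclared = seen["used"] - seen["declared"]
    return "invalid" if undeclared else "no violation found"
```

With this sketch, classify("<xyzzy/>", require_doctype=True) returns
"unassessed", while classify("<xyzzy/>", require_doctype=False) returns
"invalid".  The document with <!DOCTYPE html> followed by <xyzzy/> comes
out "invalid" under both settings, which is the case we agree on.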

At the very least the wording of the definition of *valid* [1] and the
wording in the Conformance section [2] need to be brought into line.

But I'm curious what the original authors thought they were asking
a parser to do, when invoked in validating mode, on a well-formed
document with no document type definition.

I find rxp's warning message the most illuminating:

  Document has no DTD, validating abandoned

I note, against my preference, something I've always been perplexed by
(at least I'm consistent): There are only three possible categories
allowed for a test in the metadata of the XML Test Suite [3]:
  valid
  invalid
  not-wf
and e.g. test p22pass1 in the NIST/OASIS part of the suite, which is
 <doc/>
is categorised as 'invalid'.

ht

[1] http://www.w3.org/TR/REC-xml/#dt-valid
[2] http://www.w3.org/TR/REC-xml/#dt-validating
[3] http://www.w3.org/XML/Test/xmlconf-20130923.html
-- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
 [mail from me _always_ has a .sig like this -- mail without it is forged spam]

Received on Tuesday, 28 January 2014 18:50:22 UTC