Re: Clarify that documents with DOCTYPE but without markup declaration are not subject to validation from Leif Halvard Silli on 2014-02-08 (xml-editor@w3.org from January to March 2014)

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Sat, 8 Feb 2014 14:23:26 +0100
To: Paul Grosso <paul@paulgrosso.name>
Cc: xml-editor@w3.org
Message-ID: <20140208142326150295.719a10ec@xn--mlform-iua.no>
Hi Paul! Some comments on things in your points 3 and 4.

But first: Much of what you say is good. But I also sense the attitude, 
which I have seen elsewhere, that we can somehow save ourselves from 
various XML dilemmas by making (or appearing to make) the validation 
mode stricter and stricter. I think, instead, we need some analysis of 
what is going on and of whether we still understand XML the way it was 
intended.

(In reply to Paul Grosso, Thu, 06 Feb 2014 11:09:03 -0600.)

Regarding this, from your point 3:

> (Short of a tool uniquely designed to be a validator, I would expect 
> any well-designed tool to have a "non-validating mode" and a way to 
> put that tool into that mode regardless of anything in the document.)

Do you also expect to be able to *decide* the mode "at user option"?

Why do you exempt validators from your "well-designed tool" 
expectation? After all, we have validating[1] and non-validating[2] 
conformance checkers. Why not both kinds in one product? The issue at 
hand - namely, auto-magic shifts from one parser mode to the other - 
might then have been clarified earlier!

As I make clear below, XML presupposes that the user of a validating 
processor knows that the tool runs a validating processor. This is not 
as simple as it might sound, because we seem today to have forgotten 
that XML requires validating tools to have *two* modes: a mode that 
reports validity violations and a mode where validity violation 
reporting is disabled. The choice of mode is at user option. But when 
reporting is disabled, validating mode and non-validating mode become, 
to the user, more or less identical.

So we should be able to expect tools to tell us, before parsing, 
whether they are going to use validation mode or non-validation mode!

Another reason to have both in one product is the parsing differences 
between validating and non-validating processing.[3] These differences 
persist whether or not the validating software has, "at user option", 
been set to run with or without reporting of validity violations.[4]

Validator.w3.org has no option to disable validity violation reporting. 
It thus violates the XML 1.0 requirement that validity violation 
reporting in validating processors be "at user option". Another tool 
that fails that test is xmllint. Try this:
  $ xmllint --nowarning --validate validity-violating-doc

A validating processor should be able to process this document with 
validity violation reporting disabled:

   <foo/>

(Not having that option is a disservice to validating processors.)

In order to be able to discern “no validity violation reporting” from 
“non-validation mode”, the user needs to know whether or not (s)he is 
running a validating processor. This might often be simpler to know if 
the tool at hand has only a *single* processing mode.

I therefore don't think that XML shares the expectation that 
well-designed software should be able to operate in both processing 
modes. That "validation" (in the broad sense) today often happens 
*without* a DTD supports that view.

Relating this to my issue: I clearly had in mind validating processors 
as such, regardless of whether the user has configured them to report 
validity violations or not. After all, disabling DTD-based validity 
violation reporting should of course not cause the tool to switch to 
XSD - doing that would *deprive* the user of the choice to turn 
validity violation reporting on and off.

To this, from your point 4:

>     It should not be amended to
>     make any distinction between document type declaration
>     constraints and validity constraints, and it should not
>     be amended to made a special case out of any particular
>     document type declaration (e.g., an "empty DTD").

A rush to tighten a rabbit hole? It is XML - not I - that distinguishes 
certain sub-features of the validity feature, that discerns between 
being valid per the DTD and some validity constraints on top of that. I 
have not said, however, that there should be more than a single 
validity violation reporting mode!

But we could ask: What about this document: <foo/> 
Or what about this document: <!DOCTYPE f><oo/>

For both, xmllint only says "no DTD found". A single error message. Why 
does it not say that the validity constraint that the element type has 
to be declared has been broken? If all validity constraints applied 
(for the validity violation reporter part of the software), then there 
would be many more messages! And it would then also be non-conformance 
with XML not to report them! (Since XML requires reporting of 
violations of validity constraints whenever the document fulfills the 
DTD.)

So today's validating processors do seem to think that some documents 
need no more than a single error message when there is no DTD. And 
this is clearly in line with XML. Tightening that hole might be to 
*change* XML.

At the same time, tool makers today know that there might *still* be 
more to be said than simply "there is no DTD". And it is *then* that 
they - typically silently! - make the tool shift from validating mode 
to non-validating mode.

The shift in a tool from validating processor mode to non-validating 
processor mode is clearly one that happens when the tool at hand comes 
to the conclusion that validating mode is no longer useful.

What does *that* tell us?

It tells us that, actually, the tool (and the users) perceive this not 
as a shift from validation mode to non-validation mode, but as a shift 
from *one* validation mode to *another*, more useful, validation mode!

It also tells us that *something* inside the tool has at the very least 
performed a pre-validation of the document.
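
To make the "pre-validation" idea concrete, here is a minimal sketch of 
the kind of check a tool might run before picking a mode. It is only an 
illustration, not what validator.w3.org or xmllint actually do. It uses 
Python's standard xml.parsers.expat module (a non-validating parser); 
the function name classify and the category labels are my own, 
hypothetical, names.

```python
import xml.parsers.expat


def classify(doc: str) -> str:
    """Hypothetical helper: report what DTD material a document carries.

    expat is a non-validating parser, so nothing is validated here; we
    only observe whether a DOCTYPE and markup declarations are present.
    """
    seen = {"doctype": False, "external": False, "declarations": 0}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        seen["doctype"] = True
        # An external subset is referenced but not fetched by default.
        seen["external"] = system_id is not None

    def on_element_decl(name, model):
        seen["declarations"] += 1

    def on_attlist_decl(element, attribute, type_, default, required):
        seen["declarations"] += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.ElementDeclHandler = on_element_decl
    parser.AttlistDeclHandler = on_attlist_decl
    parser.Parse(doc, True)  # raises ExpatError if not well-formed

    if not seen["doctype"]:
        return "no-doctype"
    if seen["declarations"]:
        return "doctype-with-declarations"
    if seen["external"]:
        return "doctype-with-external-subset-only"
    return "doctype-without-declarations"


for sample in ("<foo/>", "<!DOCTYPE f><oo/>",
               "<!DOCTYPE foo [<!ELEMENT foo EMPTY>]><foo/>"):
    print(classify(sample), "<-", sample)
```

The two documents discussed above fall into the first and the last 
category respectively. A tool that silently changes mode has, in 
effect, already made a distinction of this kind.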

[1] http://validator.w3.org/

[2] http://validator.w3.org/nu/

[3] http://www.w3.org/TR/REC-xml/#dt-validating
[4] http://www.w3.org/TR/REC-xml/#dt-atuseroption

-- 
leif halvard silli
Received on Saturday, 8 February 2014 13:23:56 UTC