- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Tue, 28 Jan 2014 14:49:46 -0700
- To: John Cowan <cowan@mercury.ccil.org>
- Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>, "Henry S. Thompson" <ht@inf.ed.ac.uk>, Paul Grosso <paul@paulgrosso.name>, core <public-xml-core-wg@w3.org>, Tim Bray <twbray@google.com>, Jean Paoli <jeanpa@microsoft.com>
Summary: Henry asks:

> But I'm curious what the original authors thought they were asking
> for a parser to do, when invoked in validating mode, on a
> well-formed document with no document type definition.

Memory is tricky, but my recollection is that the original authors of the spec (among whom I would count not just the three people cc'd on Henry's message but all members of the WG and ERB) were trying as hard as we could not to define *processing*, but to define a declarative data format suitable for various kinds of processing.

The authors perhaps had expectations about what behavior would be desirable in the situation you describe -- but since almost none of us had ever worked with systems of descriptive markup in which DTDs were optional, I don't know how useful those would have been.

I guess that if someone had asked, I would have wanted a validating processor to behave pretty much like an SGML processor in that situation. But that doesn't tell us whether the error in that case is a violation of a validity constraint or an exception indicating that the requested action cannot be performed because some prerequisites for the action are missing.

For what it's worth, my linguistic instinct is with Henry here, in that if someone tells me the XML document "<doc/>" is invalid, I am more likely to ask "against what schema?" or "what are you talking about?" than to agree or disagree. The term 'invalid' doesn't seem to me to apply.

So if the question is "should that be an error or elicit a message of some kind?" the answer is yes, of course. And if it's "should it be classified a validity error?" my answer is "how can it be a validity error? No validation can have taken place."

I believe the situation is analogous to that applying for XML documents with broken encoding declarations that render parsing impossible -- it's not defined as a well-formedness error, because it prevents well-formedness from being tested properly.
But that doesn't mean documents in EBCDIC which claim to be in ISO 8859-7 are well formed.

Some further comments on individual points are appended below for those who seek respite from whatever it is they ought to be doing right now.

On Jan 28, 2014, at 12:24 PM, John Cowan wrote:

> Henry S. Thompson scripsit:
>> "The present king of france is bald"
>> is not true, but not that it's false, or untrue.
> Whereas I hold with Quine and others that presupposition-failure
> sentences are just false.

Classing them as false is a convenient way to simplify one's life as someone responsible for having an answer for everything (and in particular, responsible for producing a Boolean value for arbitrary sentences), but it also gives the account of truth and falsehood based on it a certain artificiality -- even worse than the mismatch between English 'if' and the material implication of logic. It may be the case that in first-order propositional calculus all sentences are true or false; it's not a plausible claim for English, however, even for sentences which appear declarative on the surface.

> But even waiving that, I cannot see that a definition of the form
> "g(x) is true if there exists a y and f(y,x) is true" involves a
> presupposition at all.

I believe Henry's point is that on his view (which in this question is very similar to mine), a document is valid if it has a document type definition and accords with the constraints expressed in that document type definition, and a document is *invalid* if it has a document type definition and violates some constraint in that document type definition. On this view, having a DTD is a presupposition for either the predicate valid(x) or the predicate invalid(x). Your not seeing any presupposition appears to be just another way of saying you believe that "invalid" means "not valid".
To say that a document without a DTD is invalid, without first carefully defining one's terms, will strike some hearers and readers (me among them, for the little that's worth) not so much as right or wrong but simply as bizarre -- it is very similar, in this way, to any statement confidently ascribing this or that property to some object the speaker assumes must exist, and which I incline to believe does not exist.

No one of sound mind and normal knowledge of the world will respond to a claim that the current king of France is bald with a "yes that's right" or a simple "not so" -- either response would violate the normal rules of conversational implicature. The only plausible response to the claim would be an inquiry as to what the speaker thinks they are talking about, or more brusquely a statement that there is no current king of France.

If instead we reply: "Not so: the current king of France is not bald; the current king of France does not exist!", then we seem to be placing the existence of the current king of France on a par with his hirsuteness. But as an ally of Quine, you must surely be aware that existence [and by the same token, non-existence] is not a predicate.

If someone claims that a given document is invalid, and we see that it has no DTD (without first carefully defining the term "invalid" to mean something slightly different from what I think it normally means), I think the natural response would be "against what DTD?", or "against what schema?" -- or more generally "what are you talking about?"

On the topic of presuppositions: let us assume that your left shoelace is not a document with a document type declaration whose constraints it satisfies. Does it seem natural to you to say that your shoelace is invalid? Perhaps so. Or perhaps it's more natural to say that the predicates valid and invalid don't apply to shoelaces.
(If we have to force them to have some Boolean value, we can translate "is valid" to "is a document and has a schema and conforms to that schema", in which case "your shoelace is valid" and "your shoelace is invalid" will both be false.)

Part of the problem is that it is not really meaningful to say that a document is valid without reference to some specific document type definition (which I will abbreviate in what follows as "schema"). When we say "document D is valid", I believe we are using a short form for an utterance that in fuller form would be "document D is valid against schema S", which we can do whenever the identity of S is clear from context (as it will be if document D has a document type declaration). I think the same holds for "invalid", which for my linguistic instincts definitely means "violates some constraint imposed by schema S".

> I also don't think that changing "if" to "iff" as Liam suggests will
> help here either: in definitions, we usually treat "if" as "iff"
> anyway. An object is a natural number if it is either zero or the
> successor of a natural number; we don't normally bother to add that
> nothing else is a natural number.

What you mean 'we', white man?

I would be very surprised if a competent mathematician seeking to give a definition of the natural numbers failed to ensure that nothing else is a natural number. There are various ways of doing so: an explicit statement "nothing else is a natural number" is one way; defining the naturals as "the smallest set for which the following properties hold" is another, and I believe I remember learning that there is at least one other formulation which I am blanking on at the moment.

And as a reader of definitions in specs, I don't take 'if' to mean 'iff'. When there is any doubt, I take it to introduce a sufficient but not a necessary condition, and when there is no doubt I lower my opinion of the editor by a notch.
You are certainly right that some editors do treat 'if' as if it meant 'iff'. That doesn't make them right; formal specification is not the place for descriptivism.

[I express no opinion on the particular text involved; I am objecting to the claim that if = iff and that the natural numbers can usefully be defined in a way that includes John's left shoelace.]

...

>> I note, against my preference, something I've always been perplexed
>> by (at least I'm consistent): There are only three possible
>> categories allowed for a test in the metadata of the XML Test Suite
>> [3]:
>>   valid
>>   invalid
>>   not-wf

...

> I think this is based on "invalid" = "not valid" synonymy.

I agree that that would seem to be its basis. I think the trichotomy of XSD (valid, invalid, not known) is more plausible.

--
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************
Received on Tuesday, 28 January 2014 21:50:12 UTC