- From: Paul Prescod <papresco@calum.csclub.uwaterloo.ca>
- Date: Wed, 18 Dec 1996 12:48:00 -0500 (EST)
- To: gtn@ebt.com (Gavin Nicol)
- Cc: w3c-sgml-wg@w3.org
> >Fine. So in your opinion, should the "validator" be part of the XML
> >specification or not?
>
> Not part of the *language* specification.

Well, XML is a language (as well as a meta-language), so I interpret this
to mean: "XML should define the syntax of markup declarations and markup,
but should not specify the meaning of the markup declarations or the
constraints they place on the instance markup." Is that accurate?

> I do think we need pGroves and validator behaviour defined though.

Could you expand on that? Do you mean that you want to define what is and
isn't valid, or how to "hook in" an arbitrary validator?

> >If so, does it really matter what the "parser" returns?
>
> Yes, because it is the *foundation* of the entire application
> architecture. If it is not rigorously defined such that it
> is trivial to prove it correct, no other part of the system can
> be known to be correct. I like foundations of stone, not sand.

I don't think that languages with parsers of moderate complexity are
"built on foundations of sand." The proposals for whitespace elimination
in XML are not brain surgery: "look out for this attribute", "look out for
this character", "watch for this list of tags."

> >Should we specify a single standardized validation scheme or not? If we do,
> >what do you propose it should say about whitespace? If we do not, how can we
> >claim to be even vaguely SGML compatible? As you mentioned in your last
> >message, SGML's validation scheme would be just one of an infinite number
> >of equally "valid" schemes.
>
> I propose two: a "pure" XML validator, which does no transformation
> of pGroves, and another "SGML" validator, that removes whitespace
> according to "normal" SGML rules.

So a document can be valid according to the SGML validator, but invalid
according to the "pure" XML validator because it has whitespace in the
wrong place? And both validators are "correct?"
And when the same document is parsed and filtered through these two
different systems, one could give a real "RE Delenda est" behaviour (for
instance a browser written from scratch) and one could remove whitespace
according to SGML rules. So the behaviour of the "parser" would be
absolutely dead-simple (a tokenization), but the input to the formatting
process would still be up in the air as far as the user is concerned. And
both systems would be correct.

It also isn't clear to me from your proposal above if the two that you
propose are a) actually in the XML spec, or in some other spec and
b) exclusively "valid." Is my foo-sep filter equally valid? Could I write
up a spec for it and have my documents be "valid XML"? Or are only the two
filters you propose in the actual spec?

 Paul Prescod
Received on Wednesday, 18 December 1996 12:48:05 UTC