- From: Paul Prescod <papresco@calum.csclub.uwaterloo.ca>
- Date: Wed, 18 Dec 1996 08:34:08 -0500 (EST)
- To: gtn@ebt.com (Gavin Nicol)
- Cc: w3c-sgml-wg@w3.org
It's becoming a little clearer. I'm starting to wonder if we're talking
apples and oranges. Your proposal seems to define an application
architecture (what components say what to what). But I don't understand
how that translates into a *language* (what is valid, invalid, and what
constructs mean).

But in XML and SGML, the concepts of "what is the parser" and "what is
the validator" are not as interesting to users as "what is the parser
going to return to the application" and "what is the validator going to
report as correct." So separating the "parser" and the "validator" is
only interesting to CS types. Although on other days I would find that
fascinating, today I only care about the language.

What do you want the specification to say about a "valid" XML document?
That it has a different definition in different situations, depending on
your legacy needs? That whitespace is allowed in #PCDATA content, or not?

What do you want the specification to say about what the parser returns
to the application? The same thing? Depends on what you want? Let the
tools figure it out?

Thanks for furthering my education...

 Paul Prescod

Gavin said:

> >>that having all whitespace be significant still seems a reasonable
> >>way to go.
> >
> >Can you please describe *exactly* what that means?
> ....
> >
> >At other points, there has been discussion of having a DTD-reading "filter"
> >remove the whitespace. Which seems to imply that the former would be *valid*
> >as long as the filter is applied before the validation takes place. In this
> >case, the grove which is being validated is different from the grove that a
> >DTD-less parser would use.
>
> I repeat my viewpoint:
>
> 1) The *parser* does not use a DTD, and so creates a pGrove (to use
>    Elliot's term) in which *all* non-markup characters occur (lots
>    of pseudo-elements).
>    [pGrove -> pGrove]
> 2) For pure XML *validators* of the pGrove, the following:
>
>    <LIST>
>    <ITEM>foo</ITEM>
>    </LIST>
>
>    would cause an error if LIST couldn't contain #PCDATA.
>    [pGrove -> validator]
> 3) For XML *validators* of the pGrove that are built to support
>    legacy SGML systems, the following:
>
>    <LIST>
>    <ITEM>foo</ITEM>
>    </LIST>
>
>    would not cause an error (i.e. "normal" SGML behaviour, because
>    they would perform some transformation of the pGrove).
>    [pGrove -> validator -> epGrove].
>
> I expect to see most new applications built around (1), and many
> others to use (3) to obtain the semantics they desire.
>
> A "parser" is something that tokenises the stream, and checks only
> the syntactic constraints imposed by the XML grammar.
>
> A "validator" is something that takes a pGrove, and checks that it
> conforms to the constraints imposed by the grammar as defined by a
> DTD.
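
To make the three cases in the quoted viewpoint concrete, here is a
minimal sketch in Python. It is an illustration only and not anything
taken from the XML drafts or from an existing parser. The names Node,
parse, validate, strip_element_whitespace and TOY_DTD are all
hypothetical, the "DTD" is reduced to a table of allowed child names,
and the whitespace stripping is a crude stand-in for real SGML
record-end handling.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str                      # element name, or "#text" for character data
    text: str = ""
    children: list = field(default_factory=list)

# Toy tokeniser: open tags, close tags, and runs of character data only.
TOKEN = re.compile(r"<(/?)([A-Za-z][\w.-]*)>|([^<]+)")

def parse(source):
    """Case (1): DTD-less parse that keeps every non-markup character."""
    root = Node("#root")
    stack = [root]
    pos = 0
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            raise ValueError(f"malformed markup at offset {pos}")
        closing, name, text = m.groups()
        if text is not None:
            stack[-1].children.append(Node("#text", text))
        elif closing:
            if len(stack) == 1 or stack[-1].name != name:
                raise ValueError(f"unexpected </{name}>")
            stack.pop()
        else:
            node = Node(name)
            stack[-1].children.append(node)
            stack.append(node)
        pos = m.end()
    if len(stack) != 1:
        raise ValueError("unclosed element")
    return root

# Assumed stand-in for a DTD: element name -> allowed child kinds.
TOY_DTD = {"LIST": {"ITEM"}, "ITEM": {"#PCDATA"}}

def validate(node, dtd):
    """Case (2): strict check; any text, even whitespace, counts as #PCDATA."""
    errors = []
    for child in node.children:
        if node.name in dtd:
            kind = "#PCDATA" if child.name == "#text" else child.name
            if kind not in dtd[node.name]:
                errors.append(f"{kind} not allowed in {node.name}")
        errors.extend(validate(child, dtd))
    return errors

def strip_element_whitespace(node, dtd):
    """Case (3): transform the grove first, dropping whitespace-only text
    wherever the toy DTD does not allow #PCDATA."""
    allowed = dtd.get(node.name, {"#PCDATA"})
    kept = []
    for child in node.children:
        is_blank_text = child.name == "#text" and not child.text.strip()
        if is_blank_text and "#PCDATA" not in allowed:
            continue
        kept.append(strip_element_whitespace(child, dtd))
    return Node(node.name, node.text, kept)

doc = "<LIST>\n<ITEM>foo</ITEM>\n</LIST>"
grove = parse(doc)
strict_errors = validate(grove, TOY_DTD)
legacy_errors = validate(strip_element_whitespace(grove, TOY_DTD), TOY_DTD)
print(strict_errors)   # the two newlines inside LIST are reported
print(legacy_errors)   # nothing is reported after the transformation
```

The sketch shows where the question in this message bites: the same
input is reported as invalid or valid depending on whether the
transformation runs before validation, and the grove handed to the
application differs in the two cases.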
Received on Wednesday, 18 December 1996 08:34:08 UTC