Re: RS/RE, again (sorry)
It's becoming a little clearer. I'm starting to wonder if we're talking
apples and oranges. Your proposal seems to define an application
architecture (what components say what to what). But I don't understand
how that translates into a *language* (what is valid, invalid, and what
But in XML and SGML, the questions "what is the parser" and "what is the
validator" are not as interesting to users as "what is the parser going to
return to the application" and "what is the validator going to report as
correct." So separating the "parser" from the "validator" is only interesting
to CS types. On other days I would find that fascinating, but today I
only care about the language.
What do you want the specification to say about a "valid" XML document? That
it has a different definition in different situations, depending on your legacy
needs? That whitespace is allowed in #PCDATA content, or not?
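To make the question concrete, here is a minimal Python sketch of the two
validator behaviours in the proposal quoted below, as I read it. Everything in
it is my own assumption for illustration: the CONTENT_MODELS table, the
function names, and the sample LIST/ITEM document are hypothetical, and
Python's xml.etree.ElementTree merely stands in for a DTD-less parser because
it keeps every whitespace character it sees.

```python
import xml.etree.ElementTree as ET

# Hypothetical content models: element name -> (allowed child names, #PCDATA allowed?)
CONTENT_MODELS = {
    "LIST": ({"ITEM"}, False),
    "ITEM": (set(), True),
}
# Elements without a declared model are treated permissively in this sketch.
DEFAULT_MODEL = (set(), True)


def text_chunks(elem):
    """Non-empty character data sitting directly inside elem."""
    chunks = [elem.text or ""]
    chunks.extend(child.tail or "" for child in elem)
    return [c for c in chunks if c]


def strip_element_content_whitespace(elem):
    """Legacy-style transformation: drop whitespace-only data where #PCDATA is not allowed."""
    _, pcdata_ok = CONTENT_MODELS.get(elem.tag, DEFAULT_MODEL)
    if not pcdata_ok:
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        for child in elem:
            if child.tail is not None and not child.tail.strip():
                child.tail = None
    for child in elem:
        strip_element_content_whitespace(child)


def validate(elem):
    """Return a list of error messages; an empty list means the tree is valid."""
    children_ok, pcdata_ok = CONTENT_MODELS.get(elem.tag, DEFAULT_MODEL)
    errors = []
    if not pcdata_ok and text_chunks(elem):
        errors.append(f"character data not allowed in {elem.tag}")
    for child in elem:
        if child.tag not in children_ok:
            errors.append(f"{child.tag} not allowed in {elem.tag}")
        else:
            errors.extend(validate(child))
    return errors


DOC = "<LIST>\n  <ITEM>one</ITEM>\n  <ITEM>two</ITEM>\n</LIST>"

# "Pure" validation: the whitespace between the ITEMs counts as character data.
pure_errors = validate(ET.fromstring(DOC))

# "Legacy" validation: transform the tree first, then validate it.
legacy_tree = ET.fromstring(DOC)
strip_element_content_whitespace(legacy_tree)
legacy_errors = validate(legacy_tree)

print("pure:  ", pure_errors)
print("legacy:", legacy_errors)
```

The same document is invalid under the first check and valid under the second,
which is exactly the situation I am asking the specification to rule on.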
What do you want the specification to say about what the parser returns to the
application? The same thing? Depends on what you want? Let the tools figure it
Thanks for furthering my education...
> >>that having all whitespace be significant still seems a reasonable
> >>way to go.
> >Can you please describe *exactly* what that means?
> >At other points, there has been discussion of having a DTD-reading "filter"
> >remove the whitespace. Which seems to imply that the former would be *valid*
> >as long as the filter is applied before the validation takes place. In this
> >case, the grove which is being validated is different from the grove that a
> >DTD-less parser would use.
> I repeat my viewpoint:
> 1) The *parser* does not use a DTD, and so creates a pGrove (to use
> Elliot's term) in which *all* non-markup characters occur (lots
> of pseudo-elements).
> [pGrove -> pGrove]
> 2) For pure XML *validators* of the pGrove, the following:
> would cause an error if LIST couldn't contain #PCDATA.
> [pGrove -> validator]
> 3) For XML *validators* of the pGrove that are built to support
> legacy SGML systems, the following:
> would not cause an error (i.e. "normal" SGML behaviour, because
> they would perform some transformation of the pGrove).
> [pGrove -> validator -> epGrove].
> I expect to see most new applications built around (1), and many
> others to use (3) to obtain the semantics they desire.
> A "parser" is something that tokenises the stream, and checks only
> the syntactic constraints imposed by the XML grammar.
> A "validator" is something that takes a pGrove, and checks that it
> conforms to the constraints imposed by the grammar as defined by a