W3C home > Mailing lists > Public > w3c-sgml-wg@w3.org > December 1996

Re: RS/RE, again (sorry)

From: Gavin Nicol <gtn@ebt.com>
Date: Wed, 18 Dec 1996 00:04:16 -0500
Message-Id: <199612180504.AAA03261@nathaniel.ebt>
To: papresco@calum.csclub.uwaterloo.ca
CC: w3c-sgml-wg@w3.org
>>that having all whitespace be significant still seems a reasonable
>>way to go. 
>
>Can you please describe *exactly* what that means?
....
>
>At other points, there has been discussion of having a DTD-reading "filter"
>remove the whitespace. Which seems to imply that the former would be *valid*
>as long as the filter is applied before the validation takes place. In this
>case, the grove which is being validated is different from the grove that a
>DTD-less parser would use.

I repeat my viewpoint:

   1) The *parser* does not use a DTD, and so creates a pGrove (to use
      Elliot's term) in which *all* non-markup charaters occur (lot's
      of psuedo-elements). 
      [pGrove -> pGrove]
   2) For pure XML *validators* of the pGrove, the following:

          <LIST>
          <ITEM>foo</ITEM>
          </LIST>

      would cause an error if LIST couldn't contain #PCDATA.       
      [pGrove -> validator]
   3) For XML *validators* of the pGrove that are built to support
      legacy SGML systems, the following:

          <LIST>
          <ITEM>foo</ITEM>
          </LIST>

      would not cause an error (ie. "normal" SGML behaviour because
      they would perform some transformation of the pGrove).
      [pGrove -> validator -> epGrove].

I expect to see most new applications built around (1), and many
others to use (3) to obtain the semantics they desire.

A "parser" is something that tokenises the stream, and checks only
the syntactic constraints imposed by the XML grammar.

A "validator" is something that takes a pGrove, and checks that it
comforms to the constraints imposed by the grammar as defined by a
DTD.
Received on Wednesday, 18 December 1996 00:05:41 EST

This archive was generated by hypermail pre-2.1.9 : Wednesday, 24 September 2003 10:03:49 EDT