
Re: RS/RE, again (sorry)

From: Paul Prescod <papresco@calum.csclub.uwaterloo.ca>
Date: Thu, 12 Dec 1996 19:17:26 -0500
Message-Id: <1.5.4.32.19961213001726.00a6fd20@csclub.uwaterloo.ca>
To: gtn@ebt.com (Gavin Nicol)
Cc: w3c-sgml-wg@w3.org
At 04:54 PM 12/12/96 -0500, Gavin Nicol wrote:
>>It is even more than that...it is the idea of having pretty-printing 
>>whitespace be something that is handled by the application *at
>>all*. That's out of whack with most people's understanding of a
>>parser's job. C++ parsers don't return "whitespace nodes" between
>>tokens that the C++ "back end" must detect and delete.
>
>Then again, we aren't talking about C++ are we? We're talking about a
>markup language for text. A language that doesn't have delimiters
>around strings... 
>
>We seem to be confusing parsing XML, and parsing the grammar defined
>by the DTD if you ask me...

But one of the important points about SGML (of which XML is a subset) is a
contract between the parser and the application: "I will not hand you data
which does not conform to the DTD." This is *central*. Without it, we can
seldom do intelligent things with documents. XML's barely acceptable
solution is to divide parsers down the middle into those that respect this
contract and those that do not (validating and non-validating parsers). But
at least the whitespace handling of both is explicit in the XML standard.
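
A minimal sketch of that contract, assuming a hypothetical table of content
models in place of a real DTD and a simplified event stream. The names
filter_events and CONTENT_MODELS are invented for illustration, and dropping
the whitespace outright is only one possible policy, not something any XML
draft is claimed to mandate. The point is that the parser, not the
application, decides what happens to pretty-printing whitespace:

```python
# Hypothetical content-model table standing in for what a DTD declares.
# "element" = only child elements allowed; "mixed" = character data allowed.
CONTENT_MODELS = {"list": "element", "item": "mixed"}


def filter_events(events, content_models=None):
    """Yield parse events, dropping whitespace-only text in element content.

    events: iterable of ("start", name), ("text", data), ("end", name).
    content_models: None models a parser with no DTD knowledge, which
    passes every event through unchanged.
    """
    stack = []  # names of the currently open elements
    for kind, value in events:
        if kind == "start":
            stack.append(value)
        elif kind == "end":
            stack.pop()
        elif kind == "text" and content_models is not None:
            parent = stack[-1] if stack else None
            if content_models.get(parent) == "element" and value.strip() == "":
                continue  # pretty-printing whitespace never reaches the application
        yield (kind, value)


# Event stream for a pretty-printed <list> holding a single <item>one</item>.
EVENTS = [
    ("start", "list"), ("text", "\n  "),
    ("start", "item"), ("text", "one"), ("end", "item"),
    ("text", "\n"), ("end", "list"),
]
```

Calling list(filter_events(EVENTS, CONTENT_MODELS)) leaves the text "one" in
place and removes the two whitespace-only events inside <list>, while
list(filter_events(EVENTS)) returns the stream unchanged and leaves the
application to sort it out.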

Your solution would leave it entirely up to applications, which will (IMO)
almost inevitably lead to incompatibility.

 Paul Prescod
Received on Thursday, 12 December 1996 19:14:39 EST
