Re: RS/RE: Yet Another Proposal

From: Gavin Nicol <gtn@ebt.com>
Date: Thu, 3 Oct 1996 16:18:13 -0400
Message-Id: <199610032018.QAA25693@nathaniel.ebt>
To: papresco@calum.csclub.uwaterloo.ca
CC: crm@ebt.com, w3c-sgml-wg@w3.org
>> The newlines go into the database.  All of them.  They are part of the
>> data.
>The spaces between my table cells cannot be considered part of the True
>Data. "True data" is what I have in my head, before I sit down at the 

How can you claim to speak for every author in the world? By your
definition of true data, you, I and the fellow next door will all
disagree (I just checked).

If you don't want the whitespace, don't put it in.

>They have no meaning in ANY output format or as the result of ANY
>database query. They are meaningless in all but two contexts;
>#1. My raw text editor, where they were created.
>#2. An XML application that treats them as meaningful because it
>doesn't have a stylesheet, doesn't know what "application
>conventions" to apply and doesn't know what to do with them.

Again, if you type in anything other than the "true data", then why
would you expect an application to understand what you meant? DWIM
interfaces have failed...

>I think that any syntactic specification that prescribes behaviour for
>a _particular application_ is in trouble. You've specified a
>mechanism for a particular class of applications to "figure out"
>what was meant, but not for the larger set of ALL applications.

This mechanism is described as an application convention and has
nothing at all to do with syntax. A syntax in which all characters
between tag boundaries are significant is pretty rigorous to my way of
thinking. There is also another benefit to this approach: we would be
able, just from a simple ESIS tree, to create an instance that would
produce the exact same ESIS tree on parsing, *without* a DTD. Nice
feature for testing purposes...
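
The round-trip property above can be sketched as follows. This is only
an illustration, not anything from the proposal itself: it assumes
Python's xml.etree.ElementTree as a stand-in for an ESIS tree, and the
helper same_tree and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def same_tree(a, b):
    """Compare two elements node by node, with no whitespace trimming."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib
        and (a.text or "") == (b.text or "")
        and (a.tail or "") == (b.tail or "")
        and len(a) == len(b)
        and all(same_tree(x, y) for x, y in zip(a, b))
    )


# Hypothetical instance; every newline and space between tags is data.
source = "<table>\n  <row><cell>a</cell> <cell>b</cell></row>\n</table>"

first = ET.fromstring(source)                       # parse, no DTD involved
rewritten = ET.tostring(first, encoding="unicode")  # tree -> instance
second = ET.fromstring(rewritten)                   # parse the new instance

print(same_tree(first, second))
```

Because no character between tags is discarded, the comparison needs no
knowledge of which elements have element content, which is the
information a DTD would otherwise have to supply.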

>Making ALL newlines (outside of verbatim elements) NOT data is also 
>unambiguous, but preserves the SGML/HTML convention of using whitespace for
>formatting without affecting the parse tree.

Yes, but this is a *worse* hack, because now you *are* conflating syntax
and semantics. You have to be able to tell the parser "this element is

>> An SGML (or XML with DTD) parse will not be ESIS-
>> identical to an XML parse without DTD, but after application
>> conventions are applied, the result will be identical.  Isn't that
>> what matters?
>The problem is what you describe as "application conventions" are conventions
>at the level ABOVE the XML parser (as opposed to implementing XML as a set
>of SGML application conventions). So you are depending on "smart

Actually, I cannot see why a DTD would even make a difference in a
case where record boundaries don't occur, though I must confess to not
having thought it through completely.
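
As a rough sketch of what an "application convention" layered above
the parser might look like: the parser keeps every character, and a
separate pass drops whitespace-only text inside elements that have
child elements. The function name drop_formatting_whitespace and the
sample documents are hypothetical, and xml.etree.ElementTree again
stands in for the parser.

```python
import xml.etree.ElementTree as ET


def drop_formatting_whitespace(element):
    """Application-level pass: remove whitespace-only text around children.

    Assumption: any element with child elements is treated as element
    content. In mixed content this would also drop a meaningful space
    between two inline elements, which is the risk debated in the thread.
    """
    if len(element):  # only elements that contain child elements
        if element.text is not None and not element.text.strip():
            element.text = None
        for child in element:
            if child.tail is not None and not child.tail.strip():
                child.tail = None
            drop_formatting_whitespace(child)
    return element


pretty = (
    "<table>\n  <row>\n    <cell>a</cell>\n    <cell>b</cell>\n  </row>\n</table>"
)
compact = "<table><row><cell>a</cell><cell>b</cell></row></table>"

a = ET.tostring(drop_formatting_whitespace(ET.fromstring(pretty)), encoding="unicode")
b = ET.tostring(drop_formatting_whitespace(ET.fromstring(compact)), encoding="unicode")

print(a == b)
```

The two parses differ before the pass and agree after it, so the
convention lives entirely in the application, not in the syntax.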
Received on Thursday, 3 October 1996 16:19:59 UTC

This archive was generated by hypermail 2.4.0 : Friday, 17 January 2020 20:25:03 UTC