Re: Trying to sum up a bit from Paul Prescod on 1996-12-17 (w3c-sgml-wg@w3.org from December 1996)

From: Paul Prescod <papresco@calum.csclub.uwaterloo.ca>
Date: Tue, 17 Dec 1996 16:04:51 -0500
To: w3c-sgml-wg@w3.org
Message-Id: <1.5.4.32.19961217210451.009a481c@csclub.uwaterloo.ca>
Good summary Tim. As David has pointed out, it is good to keep evolving
these things so that we do not go back to old points. Could you please add
the answer to a couple of my questions, if you know them?

At 10:23 AM 12/17/96 -0800, Tim Bray wrote:
>On this whitespace bit, we're going to have to sum up, make some sort of 
>compromise, and move forward.  

Do we have anything to move forward *on*? Isn't the WG going to give us some
starting document or starting point for hyperlinking discussion? Or just
jump right in?

>2. Work significant WS into the definition of well-formedness (Grosso)
>
>Adopt a tight set of rules for well-formed documents such that all white 
>space, except RE's after tags, may safely assumed by a DTD-less processor
>to be significant.

It doesn't make sense for this to be part of the well-formedness
requirements. Anything that requires reference to a DTD should be a
*validity* requirement. After all, some well-formed documents don't even
have a DTD.

>Pro: the problem goes away
>Con: 1. it's a lot harder to author XML without an XML-savvy tool
>     2. you can't test well-formedness without looking at a DTD

This "con" becomes: "can't test validity without knowing where SSEP nodes
were...many existing parsers don't tell you".

>3. All non-markup bytes are signicant, whitespace or not (Durand)
>
>Pro: Everyone can understand the rules, it's easy to implement
>Con: You lose certain Hytime addressing facilities, and the application
>     gets no help from the XML processor in ignoring WS that to the user
>     is "obviously" irrelevant.

I think that in order to write stylesheets for this solution, you will end
up having to list all of the element context or all of the mixed context
elements in the stylesheet. Does anyone think that this is not the case?
Isn't this a Big Deal that should be Discussed?

>4. Use an mechanism *in the instance* to signal a DTD-less application
>   what's going on.
>
> 4.1 The PI-based DTD summary (Sperberg-McQueen)
> 4.2 Explicit quoting of significant character data (Goldfarb)
> 4.3 -XML-SPACE
>  4.3.1 -XML-SPACE with one value, PRESERVE (Paoli)
>  4.3.2 -XML-SPACE with two values, PRESERVE/COLLAPSE (Current ERB)
>  4.3.3 -XML-SPACE with three values, PRESERVE/COLLAPSE/SUPPRESS (Bray)
>        [which we might want to rename, if what we're really doing is 
>         signalling element content, mixed content, and verbatim content]

What does an -XML-PRESERVE element do when applied to element content? Give
a different parse tree for DTD-less parsers? Trigger an error in validating
parsers? Both? Neither?

Isn't significant whitespace the norm rather than the exception? Shouldn't
it be the default?

>Under (4.3), there are some behavioral options:
>
> 4.3a1. the XML processor does what -XML-SPACE says, eats the whitespace
>        or not, and consumes the attribute, not passing it to the application
> 4.3a2. the XML processor does what -XML-SPACE says *and* passes it on to
>        the application
> 4.3a3. the XML processor merely passes -XML-SPACE along to the application,
>        along with all the bytes of data; it is the application's job to
>        do the right thing

Why not just let the parser do it, if it knows what needs to be done? Since
we are not specifying a special message passing language for returning <!--
comments -->, why should we do so for insignificant whitespace? How are they
different?

 Paul Prescod
Received on Tuesday, 17 December 1996 16:02:03 UTC