Grosso whitespace proposal

It appears that (1) PaulP sent this only to me and not to the list and
(2) he meant to send this to the list.  Therefore, I am forwarding it.
Apologies if (1) or (2) is incorrect--I hope I am not committing
some huge faux pas.

----- Begin Included Message -----

From papresco@csclub.uwaterloo.ca Tue Dec 17 09:05:31 1996
Date: Tue, 17 Dec 1996 09:56:04 -0500
To: paul@arbortext.com (Paul Grosso)
From: Paul Prescod <papresco@csclub.uwaterloo.ca>
Subject: Grosso whitespace proposal

At 12:26 PM 12/16/96 CST, you wrote:
>What the previous paragraph does
>seem to mean is that whatever REs XML deems to be insignificant must be
>determined to be insignificant regardless of the content model since we
>may not have the DTD.  For example, we can say (as I believe we have in
>the current XML draft) that an RE immediately following a start tag is
>insignificant thereby providing places editors and others can introduce
>record ends while still allowing all XML processors to interpret the
>results identically without requiring knowledge of the DTD.

Every proposal we've seen has glitches, and I just wanted to make this one's
explicit.

a) The first may or may not be a big deal, depending on your point of view:
it means that "well formed" XML documents cannot have a list or table
formatted as they are typically formatted, where whitespace is introduced
after the item/row end-tag. That might be a compromise we could live with.
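
To make point a) concrete, here is a rough Python sketch. It assumes the
only rule in force is the one quoted above (an RE immediately following a
start tag is insignificant) and treats "\n" as the RE. The function name
and the simplistic tag pattern are illustrative assumptions, not anything
taken from the XML draft.

```python
import re

# Assumed rule: a record end (RE, modelled here as "\n") that immediately
# follows a start tag is insignificant. Nothing else is dropped.
# The pattern is deliberately naive: it matches "<NAME ...>" but not end
# tags ("</NAME>") and not tags containing "/".
START_TAG_THEN_RE = re.compile(r"(<[A-Za-z][^<>/]*>)\n")

def drop_re_after_start_tags(text):
    """Remove each newline that directly follows a start tag."""
    return START_TAG_THEN_RE.sub(r"\1", text)

doc = "<LIST>\n<ITEM>a</ITEM>\n<ITEM>b</ITEM>\n</LIST>"
result = drop_re_after_start_tags(doc)
print(repr(result))
```

Under that assumption only the RE after <LIST> is removed. The REs after
each </ITEM> remain in the output as data, which is the glitch in question.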

Is there a hack we could use to "escape" all of the whitespace up to the
next tag?

<LIST>\
     <ITEM>...</ITEM>\
     <ITEM>...</ITEM>\
     <ITEM>...</ITEM>\
</LIST>
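
If such an escape existed, a preprocessor for it might look like the
Python sketch below. The convention (a backslash followed only by
whitespace up to the next "<" is discarded together with that whitespace)
is purely hypothetical, as is the function name; it is not part of any
XML draft.

```python
import re

# Hypothetical convention: a backslash followed by whitespace running up
# to the next tag "escapes" that whitespace, so both are removed.
ESCAPED_WS = re.compile(r"\\\s+(?=<)")

def unescape_whitespace(text):
    """Delete each backslash plus the whitespace between it and the next tag."""
    return ESCAPED_WS.sub("", text)

escaped = (
    "<LIST>\\\n"
    "     <ITEM>...</ITEM>\\\n"
    "     <ITEM>...</ITEM>\\\n"
    "     <ITEM>...</ITEM>\\\n"
    "</LIST>"
)
flattened = unescape_whitespace(escaped)
print(flattened)
```

A backslash that is followed by anything other than whitespace and a tag
is left alone in this sketch.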

b) I'm a little uncomfortable giving users something that *looks* like what
they are used to, but doesn't behave like it. It may well be better from a
usability standpoint (though not a marketing one) to give them something
that looks "funny".

c) Who is going to check this well-formedness constraint? SGML parsers will
happily eat the whitespace. Non-validating XML parsers will not read the DTD
and so cannot notice whitespace in element content. We would need a new kind
of parser: a validating XML parser *not* built on top of an SGML parser.
(This is technically possible, but it is just more work.)

OTOH, perhaps this solution has the fewest glitches of the ones we have
looked at.

 Paul Prescod



----- End Included Message -----

Received on Tuesday, 17 December 1996 11:09:56 UTC