Re: SGML and XML

At 04:16 PM 11/28/96 EST, lee@sq.com wrote:
>> Whitespace would be handled as follows:
>> 
>> 1. In element content, all whitespace is ignored.
>> 2. In data content, all whitespace is preserved.
>
>Note that without a DTD it is not possible to distinguish element and
>data content. Furthermore, this behaviour makes it impossible to
>write a conforming XML program that copies its input to its output unchanged,
>just as this is impossible in SGML right now -- no matter how many virgins
>you use to tempt those unicorns :-) :-)
>
>The distinction between whitespace that is returned by the parser (i.e.
>emphatically not ignored) but that is not treated as data, and whitespace
>that is treated as data, is a useful one, I think.

We agree! We can use roughly the same language as for comments:

"Comments ... are not part of the document's character data; an XML
processor may, but need not, make it possible for an application to retrieve
the text of comments. For compatibility, the string "--" (double-hyphen) may
not occur within comments."

"An XML processor may, but need not, make it possible for an application to
retrieve the collapsed or removed whitespace."

The situations are equivalent. Whitespace in element content is there for
the benefit of those reading the source, like comments.
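
To make the proposal concrete, here is a rough sketch in Python of a
processor that drops whitespace in element content but can still hand it to
an application that asks for it. Everything named here is an assumption for
illustration only: the function drop_ignorable_whitespace, the
element_content set (a stand-in for what a DTD would declare, since without
one the two kinds of content cannot be told apart), and the report list.
Python's xml.etree.ElementTree is used only as a convenient tree; it is not
an implementation of the draft.

```python
import xml.etree.ElementTree as ET


def drop_ignorable_whitespace(root, element_content, report=None):
    """Remove whitespace-only text found directly inside elements that are
    declared to have element content.

    element_content: set of tag names; a hypothetical stand-in for DTD info.
    report: optional list; each removed run is appended as (tag, text), so
            an application may, but need not, retrieve it.
    """
    for elem in root.iter():
        if elem.tag not in element_content:
            continue  # data content: all whitespace is preserved
        # Whitespace between the start tag and the first child.
        if elem.text is not None and elem.text.strip() == "":
            if report is not None:
                report.append((elem.tag, elem.text))
            elem.text = None
        # Whitespace following each child, still inside this element.
        for child in elem:
            if child.tail is not None and child.tail.strip() == "":
                if report is not None:
                    report.append((elem.tag, child.tail))
                child.tail = None
    return root


doc = ET.fromstring("<list>\n  <item> one </item>\n  <item>two</item>\n</list>")
ignored = []
drop_ignorable_whitespace(doc, {"list"}, report=ignored)
print(ET.tostring(doc, encoding="unicode"))
# prints: <list><item> one </item><item>two</item></list>
```

The spaces around "one" survive because item is treated as data content,
while the three runs of indentation inside list end up in ignored, much as
the text of a comment might be made available without being part of the
document's character data.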

 Paul Prescod

Received on Friday, 29 November 1996 18:16:51 UTC