Re: Current Status of Discussion on RE/RS Handling from James Clark on 1996-09-27 (w3c-sgml-wg@w3.org from September 1996)

From: James Clark <jjc@jclark.com>
Date: Fri, 27 Sep 1996 20:14:25 +0000
To: "Christopher R. Maden" <crm@ebt.com>
Cc: w3c-sgml-wg@w3.org
Message-Id: <1.5.4.32.19960927201425.00ad0590@jclark.com>

At 20:00 26/09/96 GMT, Christopher R. Maden wrote:
>[Eliot Kimber]
>
>> I'll let other members of the ERB correct me if I'm wrong, but I
>> believe our assumption, in fact the whole purpose of this exercise,
>> is to enable the creation of *XML parsers* that are much simpler
>> than SGML parsers but define XML in such a way that it can also be
>> processed as SGML without change (to the instance at least, it may
>> be necessary to create a DTD or modify the XML DTD).
>
>Absolutely.  An XML parser, implemented without reference to SGML, can
>follow these rules simply.
>
>However, by putting the rules in terms of the parser, it seems that
>it's difficult for an SGML-based XML application to be compliant with
>both XML and SGML.  Just by changing the rules from a parser rule to
>an application convention, it becomes possible for an application to
>be compliant with both SGML and XML.

If the rules about ignoring white-space are left to the XML application and
the application is free to require that those rules are not applied for
verbatim elements, then XML tools built on top of SGML parsers will be
unable to correctly process some XML documents, namely those that have
verbatim elements that include REs that are ignored according to the SGML
rules. (An application could get information from the SGML parser about the
record-ends it ignored and attempt to undo the ignoring that was done by
the SGML parser, but that's not going to be practical in many cases.)  The
effect would be to prevent most unmodified SGML-based tools from being able
reliably to process XML documents.

I would say that would be a far worse situation for XML to be in than
requiring that a user, in verbatim text, simply replace space and newline by
entity references at the same time as they are replacing <, > and & by
entity references.
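
As a rough sketch of that replacement step: the helper name
escape_verbatim and the choice of numeric character references for space
and newline (&#32; and &#10;) are assumptions made for illustration only.
The exact references a document would use depend on its DTD and syntax.

```python
# Hypothetical helper: the mapping below is an illustrative assumption,
# not something fixed by this discussion.
REPLACEMENTS = {
    "&": "&amp;",   # markup-significant characters
    "<": "&lt;",
    ">": "&gt;",
    " ": "&#32;",   # space, written as a reference so it cannot be ignored
    "\n": "&#10;",  # newline, likewise
}


def escape_verbatim(text: str) -> str:
    """Replace markup and white-space characters in verbatim text.

    Working one character at a time means the '&' that starts an
    emitted reference is never escaped a second time.
    """
    return "".join(REPLACEMENTS.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(escape_verbatim("if a < b:\n  x = 1\n"))
```

Text escaped this way contains no literal spaces or newlines, so a parser
applying the SGML record-end rules has nothing left to ignore.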

James

Received on Friday, 27 September 1996 15:20:08 UTC