- From: Christopher R. Maden <crm@ebt.com>
- Date: Thu, 26 Sep 1996 18:20:46 GMT
- To: w3c-sgml-wg@w3.org
[Eliot Kimber for the ERB]
> An XML parser shall interpret white space and record ends in XML
         ^^^^^^
Really?  If an XML application is implemented strictly, as an SGML
application, then there's separation between the parser and the
application, and this step should happen in the application.  If the
XML application is not implemented in terms of SGML, then this still
seems to me to be an application convention.

It's only a semantic point, really, but I think it makes a difference:

> This approach also requires that truly significant record ends in
> data must be escaped in some way.

If the whitespace interpretation rules are phrased as an application
convention, and not a change in parsing rules, it buys this:

 o An SGML parser can be an XML parser without any changes.

 o A DSSSL-based XML application can do the whitespace rules as a
   grove operation, after an SGML parse.

 o Record ends need not be escaped.  The stylesheet can affect
   whitespace normalization behavior; if the element is to be styled
   "verbatim" or "preformatted", then whitespace is not normalized.

 o Even an SGML application can parse XML with no (or an ANY) DTD; all
   elements are mixed content, but the extra whitespace between
   elements is irrelevant, since it'll be normalized away by the
   application.  True, this won't be ESIS-identical to a parse with
   the DTD, but after application-level normalization, the "true
   content" will come out to the same thing.

The rules adopted by the ERB are good ones, but I think they should be
stated as application conventions, not parsing rules, for the benefit
of those implementing XML as an SGML application.

-Chris
-- 
<!NOTATION SGML.Geek PUBLIC "-//GCA//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//EBT//NONSGML Christopher R. Maden//EN" SYSTEM
"<URL>http://www.ebt.com <TEL>+1.401.421.9550 <FAX>+1.401.521.2030
<USMAIL>One Richmond Square, Providence, RI 02906 USA" NDATA SGML.Geek>
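
As a rough illustration of the application-convention reading argued
above, here is a minimal sketch in which the parser is left unchanged
and whitespace is normalized afterwards, on the parsed tree.  Several
things here are assumptions and not part of the ERB text: Python's
xml.etree.ElementTree stands in for the parser, the VERBATIM set stands
in for whatever the stylesheet marks as "verbatim" or "preformatted",
and the collapse rule (drop whitespace-only text, squeeze other runs to
one space) is only a crude stand-in for the ERB's actual rules.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stylesheet decision: these elements keep their whitespace.
VERBATIM = {"pre"}

def _collapse(text):
    # Stand-in rule (an assumption, not the ERB's rules): drop
    # whitespace-only text, collapse any other run to a single space.
    if text is None or not text.strip():
        return None
    return re.sub(r"\s+", " ", text)

def normalize(elem, verbatim=VERBATIM):
    """Normalize whitespace in place, after the parse is complete."""
    if elem.tag in verbatim:
        return  # leave the whole verbatim subtree untouched
    elem.text = _collapse(elem.text)
    for child in elem:
        normalize(child, verbatim)
        # A child's tail is character data of *this* element, so it is
        # normalized here even when the child itself is verbatim.
        child.tail = _collapse(child.tail)

# The parse step knows nothing about the whitespace convention.
doc = ET.fromstring("<doc>\n  <p>one\n   two</p>\n  <pre>a\n  b</pre>\n</doc>")
normalize(doc)
result = ET.tostring(doc, encoding="unicode")
print(result)
```

The record end inside <pre> survives without any escaping, while the
whitespace between elements disappears, and the parser itself needed
no changes.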
Received on Thursday, 26 September 1996 14:30:10 UTC