Re: those predeclared entity refs from Tim Bray on 1997-03-25 (w3c-sgml-wg@w3.org from March 1997)

From: Tim Bray <tbray@textuality.com>
Date: Tue, 25 Mar 1997 14:13:22 -0800
To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
Message-Id: <3.0.32.19970325141031.009ccb50@pop.intergate.bc.ca>

[various folks are edging towards various arcane methods to solve the
 "problem" of the Big 5 predeclared entities].

I have *got* to disagree.  The spec is totally crystalline on
this.  If you need a '<' in character data or attribute value,
use '&lt;'.  What could be clearer?  In implementing Lark, 
there is [after compilation] a state in the automaton that 
recognizes &lt; - this causes a '<' to be put in the data for
eventual delivery to the app.  Idiotically simple.

So on technical grounds there are no problems in XML.  On cultural
grounds, it is de facto the case that the use of &amp; and &lt; 
is nearly universal, which greatly promotes interoperability, and
is a fact of life we should be glad of.

If we no longer predeclare these, then my minimal XML parser
has to learn how to read and interpret entity declarations.  Kiss
the DPH goodbye.

If I want to process an XML doc with Full SGML, all I need to do
is declare the entities.  So maybe we need a rule saying that a
declaration of these entities with any other value than those
given by XML makes a document non-well-formed and thus non-XML.

 IN ALL XML DOCUMENTS, &amp; SHOULD MEAN '&' AND NOTHING ELSE, EVER.  
 IN ALL XML DOCUMENTS, &lt; SHOULD MEAN '<' AND NOTHING ELSE, EVER.  
 IN ALL XML DOCUMENTS, &gt; SHOULD MEAN '>' AND NOTHING ELSE, EVER.  
 IN ALL XML DOCUMENTS, &quot; SHOULD MEAN '"' AND NOTHING ELSE, EVER.  
 IN ALL XML DOCUMENTS, &apos; SHOULD MEAN "'" AND NOTHING ELSE, EVER.  

Can someone explain in simple terms what the problem is that is causing
us to consider these measures that will greatly increase the
difficulty of minimal XML parsing and the amount of explanation
necessary in the spec?  I think I've been paying attention, and I
certainly haven't seen anything raised that comes close to being
serious enough to justify these measures, which would simultaneously
increase complexity and decrease interoperability.

 - Tim

Received on Tuesday, 25 March 1997 17:14:32 UTC