Re: XML errata

At 05:43 PM 10/22/99 +0800, anne brueggemann-klein wrote:
>Hi Tim, Michael, or whoever else is reading this,

Hi Anne, Hi Derick!

>  1. The 4th paragraph of section 2.4 says that, in the content of elements,
>     character data is any string of characters which does not contain the start-delimiter
>     of any markup. Rule [14] for Character Data, however, says that character data
>     cannot contain '<', '&', nor ']]>'. The latter is excluded from CDATA section,
>     but should be allowed in "normal" character data, shouldn't it?

No, it's explicitly excluded.  I'm pretty sure we inherited this from SGML
and had to have it to retain compatibility.  It's also generally good 
practice (unlike some other things we inherited from SGML).
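
To make the exclusion concrete, here is a minimal sketch of rule [14] as a
check on a candidate string. The function name is_char_data is hypothetical
and only illustrates the production; it is not taken from the spec or from
any parser.

```python
import re

# Rule [14]: CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)
# i.e. any run of characters without '<' or '&', minus runs containing ']]>'.
_NO_MARKUP_START = re.compile(r"[^<&]*")


def is_char_data(text: str) -> bool:
    """Return True if text matches the CharData production (hypothetical helper)."""
    has_no_delimiter = _NO_MARKUP_START.fullmatch(text) is not None
    return has_no_delimiter and "]]>" not in text


if __name__ == "__main__":
    for sample in ["plain text", "a ]] b", "a ]]> b", "a < b"]:
        print(repr(sample), is_char_data(sample))
```

A literal ']]>' in element content therefore has to be written with the '>'
escaped, for example as ']]&gt;'.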

>  2. The grammar of XML is ambiguous in rules [47]-[50], because the children
>     content spec (XXX) can be viewed as a unary choice and a unary seq.

Right.  This is a known erratum and will show up when the list is
updated.
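
The overlap can be sketched with two simplified patterns. This assumes flat
groups only (no nesting, no occurrence indicators) and a reduced Name syntax;
CHOICE, SEQ and productions_matching are hypothetical names. Both patterns
use '*' for the repeated part, as in rules [49] and [50] as published, so a
group with a single content particle matches both.

```python
import re

NAME = r"[A-Za-z_][\w.-]*"  # simplified stand-in for the Name production
S = r"\s*"                  # stand-in for optional white space (S?)

# Flat-group approximations of rules [49] (choice) and [50] (seq),
# each allowing zero or more repetitions after the first particle.
CHOICE = re.compile(rf"\({S}{NAME}(?:{S}\|{S}{NAME})*{S}\)")
SEQ = re.compile(rf"\({S}{NAME}(?:{S},{S}{NAME})*{S}\)")


def productions_matching(group: str) -> list[str]:
    """List which of the two simplified productions match the whole group."""
    matches = []
    if CHOICE.fullmatch(group):
        matches.append("choice")
    if SEQ.fullmatch(group):
        matches.append("seq")
    return matches


if __name__ == "__main__":
    for group in ["(a)", "(a|b)", "(a,b)"]:
        print(group, productions_matching(group))
```

Only the one-particle group is ambiguous; as soon as a '|' or ',' appears,
exactly one of the two productions applies.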

>I've enjoyed reading the spec very much. Especially, I am much relieved at the simple way
>XML handles white space. But it raises a question: Is that compatible with SGML?
>I'm sure this question has come up before, so I am hoping for a 'canned' answer
>that won't cause you too much trouble...

That's a good question, and I think we're OK because it turns out that
nobody understands what SGML's rules really are.  E.g. James Clark and
Charles Goldfarb have been known to disagree.  Charles says "white
space caused by markup is ignored" but it is devilishly difficult to
write down in a comprehensible, computable way what "caused by markup"
means (we think they really meant "caused by prettyprinting" but that's
not what it said).  The XML committee tried to find some simple rules but 
couldn't.  So we officially concluded that no white space was in any 
meaningful sense "caused by markup" and we claim we're conforming to 8879.  
Charles and the SGML community grumbled a bit initially but have reconciled 
themselves; among other things, at least people can understand the XML rules! 
-Tim

Received on Friday, 22 October 1999 12:42:40 UTC