- From: Tim Bray <tbray@textuality.com>
- Date: Fri, 22 Oct 1999 09:42:05 -0700
- To: anne brueggemann-klein <annebk@cs.ust.hk>, xml-editor@w3.org
- Cc: dwood@cs.ust.hk (Prof. Derick Wood), brueggem@in.tum.de
At 05:43 PM 10/22/99 +0800, anne brueggemann-klein wrote:
>Hi Tim, Michael, or whoever else is reading this,

Hi Anne, Hi Derick!

> 1. The 4th paragraph of section 2.4 says that, in the content of elements,
> character data is any string of characters which does not contain the start-delimiter
> of any markup. Rule [14] for Character Data, however, says that character data
> cannot contain '<', '&', nor ']]>'. The latter is excluded from CDATA section,
> but should be allowed in "normal" character data, shouldn't it?

No, it's explicitly excluded. I'm pretty sure we inherited this from SGML and had to have it to retain compatibility. It's also generally good practice (unlike some other things we inherited from SGML).

> 2. The grammar of XML is ambiguous in rules [47]-[50], because the children
> content spec (XXX) can be viewed as a unary choice and a unary seq.

Right. This is a known erratum and will show up when the list is updated.

>I've enjoyed reading the spec very much. Especially, I am much relieved at the simple way
>XML handles white space. But it raises a question: Is that compatible with SGML?
>I'm sure this question has come up before, so I am hoping for a 'canned' answer
>that won't cause you too much trouble...

That's a good question, and I think we're OK because it turns out that nobody understands what SGML's rules really are. E.g. James Clark and Charles Goldfarb have been known to disagree. Charles says "white space caused by markup is ignored" but it is devilishly difficult to write down in a comprehensible, computable way what "caused by markup" means (we think they really meant "caused by prettyprinting" but that's not what it said). The XML committee tried to find some simple rules but couldn't. So we officially concluded that no white space was in any meaningful sense "caused by markup" and we claim we're conforming to 8879.
Charles and the SGML community grumbled a bit initially but have reconciled themselves; among other things, at least people can understand the XML rules!

-Tim
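
The two grammar points in this exchange can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not anything taken from the spec or from a real parser. The helper names (is_char_data, parses_as) are hypothetical, the content particle is simplified to a bare name, and the choice and seq patterns follow the first-edition productions [49] and [50] as best recalled, where both repeat their separator zero or more times.

```python
import re


def is_char_data(text):
    """Return True if text satisfies rule [14]: no '<', no '&', no ']]>'."""
    return "<" not in text and "&" not in text and "]]>" not in text


# Assumption: a content particle is only a bare name here. The real cp
# production also allows nested groups and the '?', '*', '+' suffixes.
CP = r"[A-Za-z_][A-Za-z0-9_.-]*"
S = r"\s*"

# Assumed first-edition shapes of [49] choice and [50] seq. Both allow
# zero separators, so a group with a single cp matches either one.
CHOICE = re.compile(rf"\({S}{CP}(?:{S}\|{S}{CP})*{S}\)")
SEQ = re.compile(rf"\({S}{CP}(?:{S},{S}{CP})*{S}\)")


def parses_as(spec):
    """List which of the two productions accept the given group."""
    names = []
    if CHOICE.fullmatch(spec):
        names.append("choice")
    if SEQ.fullmatch(spec):
        names.append("seq")
    return names


if __name__ == "__main__":
    print(is_char_data("x ]] > y"))   # no ']]>' sequence present
    print(is_char_data("x ]]> y"))    # contains ']]>'
    print(parses_as("(a)"))           # the unary case from point 2
    print(parses_as("(a|b)"))
    print(parses_as("(a, b)"))
```

Under these assumptions, "(a)" is accepted by both patterns, which is the unary choice versus unary seq ambiguity raised in point 2, while "(a|b)" and "(a, b)" each match only one.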
Received on Friday, 22 October 1999 12:42:40 UTC