XML errata

Hi Tim, Michael, or whoever else is reading this,

I've been reading the XML spec cover to cover and have noticed
two things for your errata list:

  1. The 4th paragraph of section 2.4 says that, in the content of elements,
     character data is any string of characters which does not contain the start-delimiter
     of any markup. Rule [14] for Character Data, however, says that character data
     cannot contain '<', '&', nor ']]>'. The latter is excluded from CDATA sections,
     but should be allowed in "normal" character data, shouldn't it?

  2. The grammar of XML is ambiguous in rules [47]-[50], because the children
     content spec (XXX) can be viewed either as a unary choice or as a unary seq.
     Tim, in the Annotated XML, mentions that you've striven to make the grammar
     unambiguous. One language-transparent way to do so appears to be:

     [47] children ::= ('('S? Name S? ')' | choice | seq) ('?' | '*' | '+')?

     [48]       cp ::= ((Name | choice | seq) ('?' | '*' | '+')?) | ('(' cp ')') 

     [49]   choice ::= '(' S? cp ( S? '|' S? cp)+ S? ')'

     [50]      seq ::= '(' S? cp ( S? ',' S? cp)+ S? ')'

     choice and seq get a '+' instead of their original star, making them truly multinary;
     cp gets an extra alternative that makes it possible to have superfluous brackets
     around content particles; originally, this could have been
     done as unary choices or sequences. Finally, the children production gets an
     extra alternative for content models (XXX), which originally were also achieved
     as unary choices or sequences.
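
To make point 2 concrete, here is a minimal Python sketch (not taken from the spec) that counts
how many derivations a content model has under rules [47]-[50], once with the original stars and
once with the changes proposed above. The names tokenize and count_children are made up for this
illustration, white space (S) is dropped, and Name is simplified to a plain identifier, so treat
it as a sanity check under those assumptions rather than as a real parser.

```python
import re
from collections import Counter

SPECIAL = set("()|,?*+")

def tokenize(text):
    # Assumption: Name is simplified to an identifier; white space (S) is dropped.
    return re.findall(r"[A-Za-z_][\w.-]*|[()|,?*+]", text)

def count_children(text, proposed=False):
    """Count derivations of `text` from `children` (rule [47]).

    proposed=False: original rules, '*' in choice and seq.
    proposed=True:  rules as rewritten above, '+' plus the extra alternatives.
    """
    toks = tokenize(text)
    n = len(toks)
    min_rep = 1 if proposed else 0  # minimum number of (sep cp) repetitions

    def tok(i):
        return toks[i] if i < n else None

    def is_name(t):
        return t is not None and t not in SPECIAL

    def opt_suffix(ends):
        # ('?' | '*' | '+')? : keep every end position, and also the one
        # reached by consuming a single occurrence indicator.
        out = Counter()
        for i, c in ends.items():
            out[i] += c
            if tok(i) in ("?", "*", "+"):
                out[i + 1] += c
        return out

    def group(i, sep, min_rep):
        # '(' cp (sep cp){min_rep,} ')' : maps end position -> derivation count
        out = Counter()
        if tok(i) != "(":
            return out
        frontier = cp(i + 1)
        reps = 0
        while frontier:
            if reps >= min_rep:
                for j, c in frontier.items():
                    if tok(j) == ")":
                        out[j + 1] += c
            nxt = Counter()
            for j, c in frontier.items():
                if tok(j) == sep:
                    for k, d in cp(j + 1).items():
                        nxt[k] += c * d
            frontier = nxt
            reps += 1
        return out

    def cp(i):
        # Rule [48]
        base = Counter()
        if is_name(tok(i)):
            base[i + 1] += 1
        base.update(group(i, "|", min_rep))  # choice
        base.update(group(i, ",", min_rep))  # seq
        out = opt_suffix(base)
        if proposed and tok(i) == "(":
            # proposed extra alternative: '(' cp ')'
            for j, c in cp(i + 1).items():
                if tok(j) == ")":
                    out[j + 1] += c
        return out

    # Rule [47]
    base = Counter()
    if proposed and tok(0) == "(" and is_name(tok(1)) and tok(2) == ")":
        base[3] += 1  # proposed extra alternative: '(' Name ')'
    base.update(group(0, "|", min_rep))
    base.update(group(0, ",", min_rep))
    return opt_suffix(base)[n]

for model in ["(a)", "(a,(b))", "(a|b)"]:
    print(model, count_children(model), count_children(model, proposed=True))
```

Under these assumptions, "(a)" and "(a,(b))" each come out with two derivations under the original
rules and one under the rewritten rules, while "(a|b)" has exactly one derivation either way.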

I've enjoyed reading the spec very much. In particular, I am much relieved by the simple way
XML handles white space. But it raises a question: Is that compatible with SGML?
I'm sure this question has come up before, so I am hoping for a 'canned' answer
that won't cause you too much trouble...

Best wishes,
Anne (currently posting from Hong Kong, but please reply to brueggem@in.tum.de)

Received on Friday, 22 October 1999 05:43:39 UTC