
back to XML: The "You Babe" problem



Hi everyone; just noticed an irritating conundrum in the XML spec, which
can easily be solved in a couple of arbitrary ways, without apparent 
ill-effect; this note is just to run the problem past the group's eyes to 
make sure the obvious solution doesn't break anything.

Suppose that you wanted to insert a reference to a Unicode character whose
value is decimal 47,806 or hex 0xbabe.

The XML spec as written would allow all of the following:

 &u-babe; &U-babe;  &u-BABE; &u-bABe;

which is OK at one level, and no problem for a parser (mine anyhow), but
probably not acceptable, if only for the cultural reason that SGML-folk
expect strings between '&' and ';' to be case-sensitive.
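
To illustrate why the permissive reading is no trouble for a parser, here is a minimal Python sketch. The reference shape it assumes ('&', then 'u' or 'U', a hyphen, exactly four hex digits, and ';') is inferred from the examples above and is not quoted from the spec; the names PERMISSIVE and decode_permissive are hypothetical.

```python
import re

# Assumed shape: '&', 'u' or 'U', '-', four hex digits in any case, ';'.
# The four-digit length is an assumption for illustration only.
PERMISSIVE = re.compile(r"&[uU]-([0-9a-fA-F]{4});")

def decode_permissive(text):
    """Return the code point of every reference found in text, ignoring case."""
    return [int(m.group(1), 16) for m in PERMISSIVE.finditer(text)]

refs = "&u-babe; &U-babe; &u-BABE; &u-bABe;"
print(decode_permissive(refs))  # [47806, 47806, 47806, 47806]
```

All four spellings decode to the same value, which is exactly the case-insensitivity at issue.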

So I think we should reword the spec to require exactly one of:
 (a) &u-babe;
 (b) &U-BABE;
 (c) &u-BABE;
 (d) &U-babe;
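
As a rough sketch of what each choice would mean for a checker, the four candidates can be written as strict patterns. This assumes four hex digits per reference (an assumption, not taken from the spec); CANDIDATES and accepted_by are hypothetical names.

```python
import re

# Hypothetical strict patterns, one per candidate rule (a) to (d).
# Four hex digits are assumed; the real grammar may differ.
CANDIDATES = {
    "a": re.compile(r"&u-[0-9a-f]{4};"),  # lowercase prefix, lowercase digits
    "b": re.compile(r"&U-[0-9A-F]{4};"),  # uppercase prefix, uppercase digits
    "c": re.compile(r"&u-[0-9A-F]{4};"),  # lowercase prefix, uppercase digits
    "d": re.compile(r"&U-[0-9a-f]{4};"),  # uppercase prefix, lowercase digits
}

def accepted_by(ref):
    """Return the labels of the candidate rules that accept ref exactly."""
    return [label for label, pat in CANDIDATES.items() if pat.fullmatch(ref)]

for ref in ("&u-babe;", "&U-BABE;", "&u-BABE;", "&U-babe;", "&u-bABe;"):
    print(ref, accepted_by(ref))
```

Under any single rule the mixed-case form &u-bABe; is rejected, and each of the other four spellings is accepted by only one rule.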

I cannot for the life of me see any significant advantage to any one over the
others.  If anyone sees anything worth shouting about, please do so (I really
mean, please don't) - in the event of silence, we'll just do a quickie vote
in the ERB and settle it. - Tim

