
BaseChar problem in XML 1.0?

From: Andy Heninger <heninger@us.ibm.com>
Date: Mon, 5 Feb 2001 10:08:35 -0800
To: xml-editor@w3.org
Cc: "Glenn Marcy" <gmarcy@us.ibm.com>, "Arnaud Le Hors" <lehors@us.ibm.com>
Message-ID: <OF69E57DD3.F16A6BFE-ON882569EA.0060C339@LocalDomain>
Hello XML Editors,

Here's a question that just came up regarding the definition
of allowable identifier characters in XML.

From the XML spec,

Production [85]  BaseChar includes the characters [#x2180-#x2182]. 

These are the Roman Numerals
    #x2180     1000    CD
    #x2181     5000    (No reasonable ASCII approximation)
    #x2182    10000    (No reasonable ASCII approximation)

BaseChar does not include the remaining Unicode Roman Numerals.
The full set occupies the range [#x2160-#x2183].
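
To make the mismatch concrete, here is a minimal sketch that walks the
Roman Numeral range and marks which code points fall inside the BaseChar
subrange. The two ranges are taken from this message only; the names
ROMAN_NUMERALS, BASECHAR_ROMAN and split_roman_numerals are made up for
illustration, and the character names printed depend on the Unicode
version shipped with the Python interpreter.

```python
import unicodedata

# Range this message gives for the Unicode Roman Numerals.
ROMAN_NUMERALS = range(0x2160, 0x2183 + 1)
# The only part of that range listed in production [85] BaseChar,
# according to this message (not the full BaseChar production).
BASECHAR_ROMAN = range(0x2180, 0x2182 + 1)


def split_roman_numerals():
    """Return (included, excluded) code points relative to BaseChar."""
    included = [cp for cp in ROMAN_NUMERALS if cp in BASECHAR_ROMAN]
    excluded = [cp for cp in ROMAN_NUMERALS if cp not in BASECHAR_ROMAN]
    return included, excluded


if __name__ == "__main__":
    for cp in ROMAN_NUMERALS:
        mark = "BaseChar" if cp in BASECHAR_ROMAN else "-"
        # Fall back to a placeholder if the local Unicode data has no name.
        name = unicodedata.name(chr(cp), "<unnamed>")
        print(f"#x{cp:04X}  {mark:8}  {name}")
```

Run as a script, this prints one line per code point, so the three
included characters stand out against the rest of the range.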

I checked with Mark Davis, and there is nothing from a
Unicode perspective that sets the three included characters
apart from the rest of the Unicode Roman Numerals.  It would
seem that they either all ought to be allowed or disallowed as
BaseChars.

Unicode's recommendations for Identifier characters allow them
all.

Something does not seem right.  Is there some logic here
that escapes me, or is it possible that the inclusion of
these characters is an editing error, or ???

 

  -- Andy Heninger,      IBM Cupertino, XML Technology Group
      heninger@us.ibm.com
Received on Monday, 5 February 2001 13:08:40 GMT
