
Errata in Appendix B

From: Steve Dahl <sdahl@goshawk.com>
Date: Wed, 25 Aug 1999 22:32:04 -0400
Message-ID: <37C4A723.AE03E731@goshawk.com>
To: xml-editor@w3.org
I don't understand the following quotes from Appendix B. They seem to
conflict with the general rules given for classifying Unicode characters.

> The following characters are treated as name-start characters rather
> than name characters, because the property file classifies them as
> Alphabetic: [#x02BB-#x02C1], #x0559, #x06E5, #x06E6.

In the Unicode databases that I can find (Unicode 1.1 through 3.0),
these are all classified as Lm, which should make them name characters,
not name-start characters. Which property file was used to define the
XML spec? Where can I find a copy of that file?
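
The lookup described above can be reproduced roughly as follows. This is a minimal sketch, not the procedure used to produce the XML spec: it relies on Python's standard unicodedata module, which reports categories from whatever Unicode version is bundled with the interpreter (printed below), not from the 1.1 through 3.0 databases mentioned here. The names QUOTED and categories are hypothetical, introduced only for illustration.

```python
import unicodedata

# Code points quoted from Appendix B: [#x02BB-#x02C1], #x0559, #x06E5, #x06E6.
QUOTED = list(range(0x02BB, 0x02C1 + 1)) + [0x0559, 0x06E5, 0x06E6]


def categories(code_points):
    """Map each code point to its General_Category in the bundled database.

    Assumption: the bundled database is far newer than Unicode 3.0, so this
    only illustrates the lookup; it says nothing about older property files.
    """
    return {cp: unicodedata.category(chr(cp)) for cp in code_points}


if __name__ == "__main__":
    print("Unicode database version:", unicodedata.unidata_version)
    for cp, cat in categories(QUOTED).items():
        print(f"#x{cp:04X} {cat}")
```

A category of Lm for every entry would match the classification reported in this message; any other value would point to a difference between database versions.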

> Character #x00B7 is classified as an extender, because the property
> list so identifies it.

> Character #x0387 is added as a name character, because #x00B7 is its
> canonical equivalent.

These are both classified as Po characters in all of the Unicode
character databases I could find. They are not classified as Lm, which I
assume is what is meant by Extender. Therefore, it seems like they
should not be classified as name characters.
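
Both observations, the Po category and the canonical equivalence cited in the quote, can be checked with a short sketch. The same caveat applies: Python's unicodedata module reflects the Unicode version bundled with the interpreter, which is an assumption here and not one of the databases discussed in this message. The helper name describe is hypothetical.

```python
import unicodedata


def describe(cp):
    """Return (General_Category, canonical decomposition mapping) for a code point.

    The decomposition is the raw mapping string from the bundled database;
    an empty string means the character has no decomposition.
    """
    ch = chr(cp)
    return unicodedata.category(ch), unicodedata.decomposition(ch)


if __name__ == "__main__":
    print("Unicode database version:", unicodedata.unidata_version)
    for cp in (0x00B7, 0x0387):
        cat, decomp = describe(cp)
        print(f"#x{cp:04X} category={cat} decomposition={decomp or '(none)'}")
```

A decomposition of 00B7 for #x0387 corresponds to the canonical equivalence the spec cites, while the category lookup shows whether either character is anything other than Po.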

If these three line items are correct relative to current Unicode
definitions, what is the algorithm we should use to upgrade an XML
processor to Unicode 2.1? For which characters should we *not* trust the
Unicode Consortium's character database for classification?

- Steve Dahl
Received on Wednesday, 25 August 1999 22:34:02 UTC
