- From: Steve Dahl <sdahl@goshawk.com>
- Date: Wed, 25 Aug 1999 22:32:04 -0400
- To: xml-editor@w3.org
I don't understand the following quotes from Appendix B. They seem to
conflict with the general rules given for classifying Unicode characters.

> The following characters are treated as name-start characters rather than
> name characters, because the property file classifies them as Alphabetic:
> [#x02BB-#x02C1], #x0559, #x06E5, #x06E6.

In the Unicode databases that I can find (Unicode 1.1 through 3.0), these
are all classified as Lm, which should make them name characters, not
name-start characters. Which property file was used to define the XML spec?
Where can I find a copy of that file?

> Character #x00B7 is classified as an extender, because the property list
> so identifies it.

> Character #x0387 is added as a name character, because #x00B7 is its
> canonical equivalent.

These are both classified as Po characters in all of the Unicode character
databases I could find. They are not classified as Lm, which I assume is
what is meant by Extender. Therefore, it seems they should not be
classified as name characters.

If these three line items are correct relative to current Unicode
definitions, what is the algorithm we should use to upgrade an XML processor
to Unicode 2.1? For which characters should we *not* trust the Unicode
Consortium's character database for classification?

--
- Steve Dahl
sdahl@goshawk.com
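
The category lookups described in the message above can be reproduced with a short sketch. It assumes Python's standard `unicodedata` module, which the original message does not mention; that module reports the Unicode version bundled with the interpreter, not Unicode 1.1 through 3.0, so it can only approximate the comparison being made. The constant and function names are hypothetical, chosen for illustration.

```python
import unicodedata

# Code points quoted from Appendix B in the message above.
NAME_START_EXCEPTIONS = list(range(0x02BB, 0x02C1 + 1)) + [0x0559, 0x06E5, 0x06E6]
EXTENDER_EXCEPTIONS = [0x00B7, 0x0387]


def general_category(code_point: int) -> str:
    """Return the two-letter Unicode general category (e.g. 'Lm', 'Po')."""
    return unicodedata.category(chr(code_point))


def report(code_points):
    """Map each code point, written in the spec's #xNNNN form, to its category."""
    return {f"#x{cp:04X}": general_category(cp) for cp in code_points}


if __name__ == "__main__":
    # Show which Unicode version this interpreter's database reflects.
    print("Unicode version:", unicodedata.unidata_version)
    for label, category in report(NAME_START_EXCEPTIONS + EXTENDER_EXCEPTIONS).items():
        print(label, category)
    # Decomposition mapping of #x0387, relevant to the "canonical equivalent" remark.
    print("#x0387 decomposition:", unicodedata.decomposition("\u0387"))
```

A lookup like this only shows what one version of the character database says; it does not answer which property file the XML specification was derived from.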
Received on Wednesday, 25 August 1999 22:34:02 UTC