- From: <bugzilla@wiggum.w3.org>
- Date: Fri, 27 Jun 2008 20:21:23 +0000
- To: www-xml-schema-comments@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=5818

           Summary: Unicode Database: shifting sands
           Product: XML Schema
           Version: 1.1 only
          Platform: PC
        OS/Version: Windows NT
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Datatypes: XSD Part 2
        AssignedTo: cmsmcq@w3.org
        ReportedBy: mike@saxonica.com
         QAContact: www-xml-schema-comments@w3.org

There is a Note in G.1.1:

    Note: [Unicode Database] is subject to future revision. For example,
    the mapping from code points to character properties might be updated.
    All ·minimally conforming· processors ·must· support the character
    properties defined in the version of [Unicode Database] cited in the
    normative references (Normative (§K.1)). However, implementors are
    encouraged to support the character properties defined in any future
    version.

I'm not sure that it is possible to do both. In Unicode 3.1, and therefore in XML Schema 1.0, the Ethiopic digits x1369-x1371 were in group Nd (and therefore matched \d). In Unicode 4.1 they have been moved to group No (so they no longer match \d).

A given processor, unless it has configuration options to put this under user control -- which seems unduly onerous -- is either going to support the new version or the old. In one case, x1369 will match \d, in the other case it won't. In practice, it's quite likely to depend on which version of Java or .NET you are using.

So I think we should either pin things down so processors are required to support Unicode version 4.1 and no other, or we should remove the "must" from the above note, and make it implementation-defined which version of Unicode is used. (In any case, what is a "must" doing in a Note?)

Test case reS17 in the Microsoft regex test suite is relevant: its results depend on which version of Unicode you believe in.

--
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
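
The version dependence described in the report can be observed outside any schema processor. The following is a minimal sketch in Python, which is not mentioned in the report and is used here only as a convenient stand-in: its standard `unicodedata` module exposes both the interpreter's current Unicode database and a frozen 3.2.0 snapshot (`unicodedata.ucd_3_2_0`). The helper names `categories` and `matches_backslash_d` are hypothetical, and the check assumes the XSD definition of \d as the Nd category.

```python
import unicodedata

# U+1369..U+1371, the Ethiopic digit range cited in the report.
ETHIOPIC_DIGITS = range(0x1369, 0x1372)


def categories(db=unicodedata):
    """Return the set of General_Category values that `db` assigns to the range."""
    return {db.category(chr(cp)) for cp in ETHIOPIC_DIGITS}


def matches_backslash_d(db=unicodedata):
    """True if every code point in the range is in Nd (assumed meaning of \\d in XSD)."""
    return categories(db) == {"Nd"}


if __name__ == "__main__":
    # Database shipped with the running interpreter (later than Unicode 4.1).
    print("Unicode", unicodedata.unidata_version, sorted(categories()))
    # Frozen Unicode 3.2.0 snapshot, which predates the change described above.
    old = unicodedata.ucd_3_2_0
    print("Unicode", old.unidata_version, sorted(categories(old)))
```

If the two lines printed by this script differ, the same regular expression gives different answers depending only on which database the processor was built against, which is the situation the report asks the specification to resolve.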
Received on Friday, 27 June 2008 20:22:00 UTC