
[Bug 10008] New: Use of Unicode blocks that no longer exist in regular expressions.

From: <bugzilla@jessica.w3.org>
Date: Thu, 24 Jun 2010 11:47:12 +0000
To: www-xml-schema-comments@w3.org
Message-ID: <bug-10008-703@http.www.w3.org/Bugs/Public/>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10008

           Summary: Use of Unicode blocks that no longer exist in regular
                    expressions.
           Product: XML Schema
           Version: 1.0/1.1 both
          Platform: PC
        OS/Version: Windows NT
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Datatypes: XSD Part 2
        AssignedTo: David_E3@VERIFONE.com
        ReportedBy: oliver@cbcl.co.uk
         QAContact: www-xml-schema-comments@w3.org
                CC: cmsmcq@blackmesatech.com


Section F.1.1 states the following:

Note:  [Unicode Database] is subject to future revision. For example, the
grouping of code points into blocks might be updated. All ·minimally
conforming· processors ·must· support the blocks defined in the version of
[Unicode Database] that is current at the time this specification became a W3C
Recommendation. However, implementors are encouraged to support the blocks
defined in any future version of the Unicode Standard.

Unfortunately, some of these blocks no longer exist under the same names in the
current Unicode specification.  I believe the changes are limited to the
following:

CombiningMarksforSymbols is now CombiningDiacriticalMarksforSymbols

Greek is now GreekandCoptic

PrivateUse has been split into three groups (we think):
PrivateUseArea, SupplementaryPrivateUseAreaA and SupplementaryPrivateUseAreaB.
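For reference, a minimal sketch of the code point ranges these current blocks
cover, as listed in the Unicode Blocks.txt data file. The table and the
block_of helper are illustrative assumptions, not part of the Schema
specification, and the block names are spelled as in this report:

```python
# Ranges of the current blocks named above (inclusive, from Unicode Blocks.txt).
# Names are spelled as written in this report; the helper is hypothetical.
CURRENT_BLOCKS = {
    "GreekandCoptic": (0x0370, 0x03FF),
    "CombiningDiacriticalMarksforSymbols": (0x20D0, 0x20FF),
    "PrivateUseArea": (0xE000, 0xF8FF),
    "SupplementaryPrivateUseAreaA": (0xF0000, 0xFFFFF),
    "SupplementaryPrivateUseAreaB": (0x100000, 0x10FFFF),
}


def block_of(code_point):
    """Return the name of the listed block containing code_point, or None."""
    for name, (first, last) in CURRENT_BLOCKS.items():
        if first <= code_point <= last:
            return name
    return None
```

Only the five blocks discussed here are listed, so block_of returns None for
any code point outside them.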

The specification does not say how a processor should treat these old block
names.  I suggest that the correct behaviour should be one of the following:

1) The old block names should no longer be valid.  This directly contradicts
the specification and would cause compatibility problems.

2) The old names should refer to groups in an older version of the Unicode
specification that did have them.  In particular I suggest that this should be
the version used in the Schema specification.

3) The old names should map to the equivalent groups in the newer version of
the specification.  I can't find this mapping specified anywhere, but I believe
it to be as described above (at least for the current version).
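The three options above can be sketched as a single name-resolution step. This
is a minimal illustration only: the resolve_block function, the policy names
"reject", "legacy" and "map", and the RENAMED table are hypothetical, and the
table simply encodes the renames listed earlier in this report, including the
unconfirmed PrivateUse split.

```python
# Old block name -> current block name(s), as listed in this report.
# The PrivateUse split is the reporter's belief, not a confirmed mapping.
RENAMED = {
    "CombiningMarksforSymbols": ["CombiningDiacriticalMarksforSymbols"],
    "Greek": ["GreekandCoptic"],
    "PrivateUse": [
        "PrivateUseArea",
        "SupplementaryPrivateUseAreaA",
        "SupplementaryPrivateUseAreaB",
    ],
}


def resolve_block(name, policy):
    """Resolve a block name from \\p{Is...} under one of the three options.

    Returns a list of block names to look up. Under "legacy" the caller is
    expected to look the name up in the older Unicode version's block table.
    """
    if policy not in ("reject", "legacy", "map"):
        raise ValueError("unknown policy: " + policy)
    if name not in RENAMED:
        return [name]  # name is unchanged, nothing to decide
    if policy == "reject":
        # Option 1: the old name is no longer valid.
        raise ValueError("obsolete block name: " + name)
    if policy == "legacy":
        # Option 2: keep the old name, resolved against the older version.
        return [name]
    # Option 3: map the old name to its current equivalent(s).
    return list(RENAMED[name])
```

For example, resolve_block("Greek", "map") yields ["GreekandCoptic"], while
resolve_block("Greek", "reject") raises an error.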

Received on Thursday, 24 June 2010 11:47:17 GMT
