- From: Robin Berjon <robin.berjon@expway.fr>
- Date: Wed, 02 Jul 2003 18:18:14 +0200
- To: "Henry S. Thompson" <ht@cogsci.ed.ac.uk>
- Cc: Jeni Tennison <jeni@jenitennison.com>, xmlschema-dev@w3.org
Henry S. Thompson wrote:
> Robin Berjon <robin.berjon@expway.fr> writes:
>>Similarly, I couldn't find anything in the spec to control
>>case-sensitivity. Did I miss it or has it been overlooked? Without it
>>it is a true pain matching case-insensitive values (barbaz becoming
>>[bB][aA][rR][bB][aA][rR]).
>
> Case insensitivity is somewhere between very difficult and incoherent
> for Unicode, as I understand it. Different languages have different
> opinions about what the uppercase/lowercase correspondences are,
> e.g. (again, allegedly -- I'm not a writing system expert) the
> upper-case of Montréal, Canada is MONTREAL, but the upper case of
> Montréal, France is MONTRÉAL.

Case insensitivity is certainly difficult, however Unicode seems to have
defined a behaviour, which XSLT/XPath/XQuery have apparently adopted:

  http://www.w3.org/TR/xpath-functions/#func-upper-case
  http://www.w3.org/TR/xpath-functions/#func-lower-case
  http://www.unicode.org/unicode/reports/tr21/

>>While on this topic, I'd like to point out that a lot of literature
>>out there states that XML Schema borrowed Perl's patterns, sometimes
>>saying that it added Unicode support. That's fairly untrue: 1) Perl's
>>patterns include full Unicode support, and 2) XML Schema uses a small
>>subset of them.
>
> Um, I _believe_ that the fact is that we took the regexps directly
> from Unicode -- the REC says:

That was my understanding as well, thanks for clarifying. I wish the
stuff that can be read on XML Schema were more precise (not that it's
easy but still).

-- 
Robin Berjon <robin.berjon@expway.fr>
Research Engineer, Expway        http://expway.fr/
7FC0 6F5F D864 EFB8 08CE 8E74 58E6 D5DB 4889 2488
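
As a rough illustration of the workaround mentioned in the message (spelling out both cases of every letter because the pattern language has no case-insensitivity switch), the expansion can be generated mechanically. This is only a sketch under stated assumptions: the function name case_insensitive_pattern is hypothetical, the result is checked with Python's re module rather than an XML Schema processor, and it relies on simple one-to-one case mappings, so characters whose upper or lower case is more than one character long are left as literals, and language-specific rules such as the Montréal example are not handled.

```python
import re


def case_insensitive_pattern(literal: str) -> str:
    """Expand a literal into a pattern matching it in any letter case.

    Hypothetical helper: each cased character becomes a two-character
    class such as [bB]; everything else is emitted as an escaped literal.
    """
    parts = []
    for ch in literal:
        lower, upper = ch.lower(), ch.upper()
        # Only use a character class when the mapping is a simple
        # one-to-one pair; multi-character mappings stay literal.
        if lower != upper and len(lower) == 1 and len(upper) == 1:
            parts.append("[" + lower + upper + "]")
        else:
            # Assumption: Python's escaping rules, which may differ from
            # the escapes allowed in XML Schema patterns.
            parts.append(re.escape(ch))
    return "".join(parts)


pattern = case_insensitive_pattern("bar")
print(pattern)
print(bool(re.fullmatch(pattern, "BaR")))
```

The generated pattern grows to roughly four characters per letter of the original value, which is the readability cost the message complains about.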
Received on Wednesday, 2 July 2003 12:19:04 UTC