- From: Henry S. Thompson <ht@cogsci.ed.ac.uk>
- Date: Wed, 02 Jul 2003 18:02:09 +0100
- To: Robin Berjon <robin.berjon@expway.fr>
- Cc: Jeni Tennison <jeni@jenitennison.com>, xmlschema-dev@w3.org
Robin Berjon <robin.berjon@expway.fr> writes:
> Henry S. Thompson wrote:
>> Robin Berjon <robin.berjon@expway.fr> writes:
>>>Similarly, I couldn't find anything in the spec to control
>>>case-sensitivity. Did I miss it, or has it been overlooked? Without
>>>it, matching case-insensitive values is a true pain (barbaz becoming
>>>[bB][aA][rR][bB][aA][zZ]).
>> Case insensitivity is somewhere between very difficult and incoherent
>> for Unicode, as I understand it. Different languages have different
>> opinions about what the uppercase/lowercase correspondences are,
>> e.g. (again, allegedly -- I'm not a writing system expert) the
>> upper case of Montréal, Canada is MONTREAL, but the upper case of
>> Montréal, France is MONTRÉAL.
>
> Case insensitivity is certainly difficult; however, Unicode seems to
> have defined a behaviour, which XSLT/XPath/XQuery have apparently
> adopted:
>
> http://www.w3.org/TR/xpath-functions/#func-upper-case
> http://www.w3.org/TR/xpath-functions/#func-lower-case
> http://www.unicode.org/unicode/reports/tr21/
The latter, in the last version that was available before Schema went
to REC [1], says:
"These are the default definitions to be used in the absence of
tailoring for particular languages and environments."
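
To make "default" concrete, here is a minimal sketch in Python (my choice
for illustration only; nothing in this thread uses it). As far as I know,
Python's str.upper() applies the untailored Unicode mappings regardless of
locale, so it can produce neither the alleged Canadian MONTREAL nor the
Turkish dotted capital for "i". The helper name default_upper is
hypothetical.

```python
# Sketch only: shows untailored (default) Unicode uppercasing.
# default_upper is a hypothetical helper name, not part of any spec.

def default_upper(text: str) -> str:
    """Uppercase using Python's locale-independent Unicode mapping."""
    return text.upper()


if __name__ == "__main__":
    # No language tailoring is applied to any of these inputs.
    for word in ["Montréal", "straße", "i"]:
        print(word, "->", default_upper(word))
```

Any language-specific result would need tailoring layered on top of this
default, which is the part the quoted sentence leaves open.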
I think the judgement the Schema WG made was that supporting
case-insensitive matching was too much work for too little utility,
but I understand that opinions might differ.
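
In the meantime, the bracket expansion mentioned at the top of the thread
can at least be generated rather than typed by hand. The following is a
rough sketch in Python, assuming simple one-to-one case pairs only; the
function name case_insensitive_pattern is hypothetical, and the escaping
follows Python's re module, which may differ in detail from the Schema
regex syntax.

```python
import re


def case_insensitive_pattern(literal: str) -> str:
    """Expand a literal into a pattern matching any mix of upper/lower case.

    Assumption: only characters whose lower and upper forms are single
    characters get a [xX] class; everything else is escaped and matched as-is.
    """
    parts = []
    for ch in literal:
        lower, upper = ch.lower(), ch.upper()
        if lower != upper and len(lower) == 1 and len(upper) == 1:
            # Cased letter with a one-to-one pair: emit a two-member class.
            parts.append("[" + lower + upper + "]")
        else:
            # Uncased or multi-character mapping: match the character literally.
            parts.append(re.escape(ch))
    return "".join(parts)


if __name__ == "__main__":
    pattern = case_insensitive_pattern("barbaz")
    print(pattern)
    print(bool(re.fullmatch(pattern, "BarBAZ")))
```

This does nothing about the language-specific cases discussed above; it
only automates the tedious part for plain one-to-one letters.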
ht
[1] http://www.unicode.org/reports/tr21/tr21-5.html
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
Half-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
Received on Wednesday, 2 July 2003 13:02:24 UTC