Re: schema pattern matching (negate)

From: Henry S. Thompson <ht@cogsci.ed.ac.uk>
Date: Wed, 02 Jul 2003 18:02:09 +0100
To: Robin Berjon <robin.berjon@expway.fr>
Cc: Jeni Tennison <jeni@jenitennison.com>, xmlschema-dev@w3.org
Message-ID: <f5bfzloamlq.fsf@erasmus.inf.ed.ac.uk>

Robin Berjon <robin.berjon@expway.fr> writes:

> Henry S. Thompson wrote:
>> Robin Berjon <robin.berjon@expway.fr> writes:
>>>Similarly, I couldn't find anything in the spec to control
>>>case-sensitivity. Did I miss it or has it been overlooked? Without it
>>>it is a true pain matching case-insensitive values (barbaz becoming
>>>[bB][aA][rR][bB][aA][zZ]).
>> Case insensitivity is somewhere between very difficult and incoherent
>> for Unicode, as I understand it.  Different languages have different
>> opinions about what the uppercase/lowercase correspondences are,
>> e.g. (again, allegedly -- I'm not a writing system expert) the
>> upper-case of Montréal, Canada is MONTREAL, but the upper case of
>> Montréal, France is MONTRÉAL.
>
> Case insensitivity is certainly difficult, however Unicode seems to
> have defined a behaviour, which XSLT/XPath/XQuery have apparently
> adopted:
>
>    http://www.w3.org/TR/xpath-functions/#func-upper-case
>    http://www.w3.org/TR/xpath-functions/#func-lower-case
>    http://www.unicode.org/unicode/reports/tr21/

The last of these, in the latest version that was available before
Schema went to REC [1], says

  "These are the default definitions to be used in the absence of
  tailoring for particular languages and environments."
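
To make "default definitions ... in the absence of tailoring" concrete,
here is a minimal sketch in Python, whose str.upper() applies Unicode's
default, language-independent case mappings. The helper name
default_upper and the sample strings are assumptions chosen for
illustration; nothing here comes from the Schema spec.

```python
# Sketch only: Python's str.upper() uses Unicode's default case mappings
# and takes no locale or language argument, so no tailoring is applied.

def default_upper(text: str) -> str:
    """Upper-case `text` with the default (untailored) Unicode mappings."""
    return text.upper()


if __name__ == "__main__":
    # Hypothetical samples: an accented letter, a letter whose upper case
    # is tailored in some languages, and a one-to-many mapping.
    for sample in ["Montréal", "i", "straße"]:
        print(sample, "->", default_upper(sample))
```

The accent survives ("Montréal" comes out the same whichever country is
meant), "i" always maps to "I" even where a language would want a dotted
capital, and "ß" expands to two characters. Those are exactly the cases
where a default mapping and a tailored one can disagree.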

I think the judgement the Schema WG made was that this was too much
work for too little utility, but I understand that opinions might
differ.
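
In the meantime the workaround Robin describes can at least be
generated rather than typed by hand. A rough sketch, assuming Python as
the generator: the function name case_insensitive_pattern and the set
of characters it escapes are my own choices for illustration, not
anything defined by the Schema spec, and only characters with a simple
one-to-one case pair are expanded.

```python
# Sketch only: expand a literal string into a pattern that matches it in
# any mix of upper and lower case, using one character class per letter.

# Assumed set of characters that need a backslash outside a character
# class in a schema pattern; adjust to the regex flavour actually used.
META_CHARS = set(".\\?*+{}()[]|")


def case_insensitive_pattern(literal: str) -> str:
    parts = []
    for ch in literal:
        lower, upper = ch.lower(), ch.upper()
        if lower != upper and len(lower) == 1 and len(upper) == 1:
            # Simple one-to-one case pair: emit a two-member class.
            parts.append(f"[{lower}{upper}]")
        elif ch in META_CHARS:
            # Uncased metacharacter: escape it so it stays literal.
            parts.append("\\" + ch)
        else:
            # Uncased, or a one-to-many mapping such as "ß": keep as-is.
            parts.append(ch)
    return "".join(parts)


if __name__ == "__main__":
    print(case_insensitive_pattern("barbaz"))
    # prints [bB][aA][rR][bB][aA][zZ]
```

The result would go into the value of an xs:pattern facet. It uses the
default mappings only, so it inherits the language-sensitivity problems
discussed above.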

ht

[1] http://www.unicode.org/reports/tr21/tr21-5.html
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                      Half-time member of W3C Team
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/
 [mail really from me _always_ has this .sig -- mail without it is forged spam]
Received on Wednesday, 2 July 2003 13:02:24 GMT