[Bug 1850] [F&O] how do ranges work in case-insensitive mode? from bugzilla@wiggum.w3.org on 2005-09-14 (public-qt-comments@w3.org from September 2005)

From: <bugzilla@wiggum.w3.org>
Date: Wed, 14 Sep 2005 21:28:26 +0000
To: public-qt-comments@w3.org
Cc:
Message-Id: <E1EFenq-0005xf-Hn@wiggum.w3.org>

http://www.w3.org/Bugs/Public/show_bug.cgi?id=1850

------- Additional Comments From davidc@nag.co.uk  2005-09-14 21:28 -------
Both of the recent proposals have had the example

  For example, "[A-Z]" expands to "[A-Za-z]". 

But I think that they would (both) imply

[A-Za-z&#x017F;&#x212A;]

If my understanding of the proposals (and
http://www.unicode.org/Public/UNIDATA/CaseFolding.txt)
is correct.



Both of these are listed in that file with status C (common case folding):
017F; C; 0073; # LATIN SMALL LETTER LONG S
212A; C; 006B; # KELVIN SIGN
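
The two entries above can be checked against any Unicode-aware runtime. What
follows is only a minimal sketch in Python, which is not part of either
proposal. It assumes that str.casefold() (Unicode full case folding, which
includes the status C entries) is an acceptable stand-in for the data in
CaseFolding.txt; fold_report is a hypothetical helper name.

```python
import unicodedata

def fold_report(ch):
    """Return (code point, Unicode name, case-folded form) for one character."""
    # Assumption: str.casefold() stands in for a lookup in CaseFolding.txt.
    return (f"{ord(ch):04X}", unicodedata.name(ch), ch.casefold())

# The two characters discussed in this comment.
for ch in ("\u017F", "\u212A"):
    print(fold_report(ch))
```

Both characters fold to an ASCII letter (s and k respectively), which is why
they become candidates for membership once a range is expanded
case-insensitively.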

Actually, I'm fairly sure that the proposals imply that
[a-z] expands to [A-Za-z&#x017F;&#x212A;]
(as toLowercase() maps KELVIN SIGN to k).

However, in the case of the actual example [A-Z], it depends on the intended
meaning of:

   one character is considered to be a *case-variant* of another character
   if there is a default case mapping between the two characters as defined in
   section 3.13 of [The Unicode Standard]. 

There is no case mapping of KELVIN SIGN into the range A-Z, only into the range
a-z. However, it would be pretty strange if [a-z] and [A-Z] did not denote the
same set when the "i" flag is set. So perhaps "case variant" needs to be defined
such that two characters are case variants if the default Unicode case mappings
map them to the same character. K and KELVIN SIGN would then be case variants,
as they both lowercase to k.
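
The difference between the two readings of "case variant" can be sketched as
follows. This is an illustration under stated assumptions, not the wording of
either proposal: Python's str.lower() and str.upper() stand in for the default
Unicode case mappings, and directly_mapped and share_lowercase are hypothetical
names for the two candidate definitions.

```python
KELVIN = "\u212A"  # KELVIN SIGN

def directly_mapped(a, b):
    """Reading 1: a default case mapping takes one character to the other."""
    return a != b and (b in (a.lower(), a.upper()) or a in (b.lower(), b.upper()))

def share_lowercase(a, b):
    """Reading 2: both characters lowercase to the same character."""
    return a.lower() == b.lower()

# Compare KELVIN SIGN with the lowercase and uppercase ASCII letters.
for letter in ("k", "K"):
    print(letter, directly_mapped(letter, KELVIN), share_lowercase(letter, KELVIN))
```

Under these assumptions the first reading relates KELVIN SIGN only to k, while
the second relates it to both k and K, so only the second makes [a-z] and [A-Z]
agree on this character.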

Received on Wednesday, 14 September 2005 21:28:33 UTC