- From: <bugzilla@wiggum.w3.org>
- Date: Tue, 27 Sep 2005 15:48:54 +0000
- To: public-qt-comments@w3.org
- Cc:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=1850

ashok.malhotra@oracle.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Additional Comments From ashok.malhotra@oracle.com  2005-09-27 15:48 -------
The WGs decided on 9/27 to accept Michael Kay's proposal in comment #16.
See below.

The detailed rules for the effect of the "i" flag are as follows. In these
rules, one character C2 is considered to be a *case-variant* of another
character C1 if the following XPath expression returns true, when the two
characters are considered as strings of length one, and the Unicode
codepoint collation is used:

    fn:lower-case(C1) eq fn:lower-case(C2)
    or
    fn:upper-case(C1) eq fn:upper-case(C2)

Note that the case-variants of a character under this definition are always
single characters.

1. When a normal character (Char) is used as an atom, it represents the set
   containing that character and all its case-variants. For example, the
   regular expression "z" will match both "z" and "Z".

2. A character range (charRange) represents the set containing all the
   characters that it would match in the absence of the "i" flag, together
   with their case-variants. For example, the regular expression "[A-Z]"
   will match all the letters A-Z and all the letters a-z. It will also
   match certain other characters such as x212A (KELVIN SIGN), since
   fn:lower-case("&#x212A;") is "k".

   This rule applies also to a character range used in a character class
   subtraction (charClassSub): thus [A-Z-[IO]] will match characters such as
   "A", "B", "a", and "b", but will not match "I", "O", "i", or "o".

   The rule also applies to a character range used as part of a negative
   character group: thus [^Q] will match every character except "Q" and "q"
   (these being the only case-variants of "Q" in Unicode).

3. A back-reference is compared using case-blind comparison: that is, each
   character must either be the same as the corresponding character of the
   previously matched string, or must be a case-variant of that character.
   For example, the strings "Mum", "mom", "Dad", and "DUD" all match the
   regular expression "([md])[aeiou]\1" when the "i" flag is used.

4. All other constructs are unaffected by the "i" flag. For example,
   "\p{Lu}" continues to match upper-case letters only.
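
As an illustration only (not part of the proposal), the case-variant
definition and the case-blind back-reference comparison of rule 3 can be
sketched in Python. The function names is_case_variant and case_blind_equal
are hypothetical, and the sketch assumes that Python's str.lower() and
str.upper() are an acceptable stand-in for fn:lower-case and fn:upper-case;
the two may differ for some characters depending on the Unicode version in
use.

```python
def is_case_variant(c1: str, c2: str) -> bool:
    """Approximate the case-variant test from the proposal.

    Assumption: str.lower()/str.upper() stand in for the XPath functions
    fn:lower-case and fn:upper-case. Both arguments must be single
    characters, mirroring "strings of length one" in the definition.
    """
    if len(c1) != 1 or len(c2) != 1:
        raise ValueError("both arguments must be single characters")
    return c1.lower() == c2.lower() or c1.upper() == c2.upper()


def case_blind_equal(captured: str, candidate: str) -> bool:
    """Rule 3 sketch: compare a back-reference candidate to the captured text.

    Each character must be identical to the corresponding captured
    character or be a case-variant of it.
    """
    if len(captured) != len(candidate):
        return False
    return all(
        a == b or is_case_variant(a, b)
        for a, b in zip(captured, candidate)
    )


if __name__ == "__main__":
    # KELVIN SIGN (U+212A) against LATIN CAPITAL LETTER K, as in rule 2.
    print(is_case_variant("K", "\u212a"))  # True
    # Group 1 captured "M" in "Mum"; the trailing "m" is compared case-blind.
    print(case_blind_equal("M", "m"))      # True
```

Under these assumptions the KELVIN SIGN example from rule 2 and the "Mum"
example from rule 3 both come out as matches, while unrelated letters such
as "a" and "b" do not.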
Received on Tuesday, 27 September 2005 15:52:27 UTC