- From: <bugzilla@wiggum.w3.org>
- Date: Tue, 27 Sep 2005 15:48:54 +0000
- To: public-qt-comments@w3.org
- Cc:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=1850
ashok.malhotra@oracle.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
------- Additional Comments From ashok.malhotra@oracle.com 2005-09-27 15:48 -------
The WGs decided on 27 September 2005 to accept Michael Kay's proposal in comment #16, reproduced below.
The detailed rules for the effect of the "i" flag are as follows. In these
rules, one character C2 is considered to be a *case-variant* of another
character C1 if the following XPath expression returns true, when the two
characters are considered as strings of length one, and the Unicode codepoint
collation is used:
fn:lower-case(C1) eq fn:lower-case(C2)
or
fn:upper-case(C1) eq fn:upper-case(C2)
Note that the case-variants of a character under this definition are always
single characters.
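As a rough illustration of the definition above, here is a minimal Python sketch. The name is_case_variant is hypothetical, and the sketch assumes that Python's str.lower() and str.upper() are an acceptable stand-in for fn:lower-case and fn:upper-case; Python compares strings by code point, which plays the role of the Unicode codepoint collation here.

```python
def is_case_variant(c1: str, c2: str) -> bool:
    """Return True if c2 is a case-variant of c1 under the definition above.

    Assumption: str.lower()/str.upper() approximate fn:lower-case and
    fn:upper-case. Both arguments must be strings of length one.
    """
    if len(c1) != 1 or len(c2) != 1:
        raise ValueError("both arguments must be single characters")
    # Python's == on str compares code points, mirroring the
    # Unicode codepoint collation required by the definition.
    return c1.lower() == c2.lower() or c1.upper() == c2.upper()
```

Under this sketch, is_case_variant("z", "Z") and is_case_variant("K", "\u212a") both hold, while is_case_variant("a", "b") does not.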
1. When a normal character (Char) is used as an atom, it represents the set
containing that character and all its case-variants. For example, the regular
expression "z" will match both "z" and "Z".
2. A character range (charRange) represents the set containing all the
characters that it would match in the absence of the "i" flag, together with
their case-variants. For example, the regular expression "[A-Z]" will match all
the letters A-Z and all the letters a-z. It will also match certain other
characters such as x212A (KELVIN SIGN), since fn:lower-case applied to x212A returns "k".
This rule applies also to a character range used in a character class
subtraction (charClassSub): thus [A-Z-[IO]] will match characters such as "A",
"B", "a", and "b", but will not match "I", "O", "i", or "o".
The rule also applies to a character range used as part of a negative character
group: thus [^Q] will match every character except "Q" and "q" (these being the
only case-variants of "Q" in Unicode).
3. A back-reference is compared using case-blind comparison: that is, each
character must either be the same as the corresponding character of the
previously matched string, or must be a case-variant of that character. For
example, the strings "Mum", "mom", "Dad", and "DUD" all match the regular
expression "([md])[aeiou]\1" when the "i" flag is used.
4. All other constructs are unaffected by the "i" flag. For example, "\p{Lu}"
continues to match upper-case letters only.
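The four rules above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the proposal's implementation: all helper names (with_case_variants, char_range, matches_not_q, case_blind_equal, is_upper_case_letter) are hypothetical, str.lower() and str.upper() stand in for fn:lower-case and fn:upper-case, and unicodedata.category stands in for "\p{Lu}". The final pattern uses Python's own re engine, whose case-insensitive matching is not defined by these rules but agrees with them on this ASCII example.

```python
import re
import unicodedata


def is_case_variant(c1, c2):
    """Case-variant test from the definition (str.lower/upper are stand-ins)."""
    return c1.lower() == c2.lower() or c1.upper() == c2.upper()


def with_case_variants(chars):
    """Rules 1 and 2: the given characters plus all of their case-variants."""
    lowers = {c.lower() for c in chars}
    uppers = {c.upper() for c in chars}
    # One pass over every code point: slow, but it mirrors the definition.
    return {c for c in map(chr, range(0x110000))
            if c.lower() in lowers or c.upper() in uppers}


def char_range(first, last):
    """Characters a range such as A-Z matches without the "i" flag."""
    return {chr(cp) for cp in range(ord(first), ord(last) + 1)}


# Rule 2: [A-Z] under the "i" flag.
A_TO_Z = with_case_variants(char_range("A", "Z"))

# Rule 2, subtraction: [A-Z-[IO]] drops "I", "O" and their case-variants.
A_TO_Z_MINUS_IO = A_TO_Z - with_case_variants({"I", "O"})

# Rule 2, negation: [^Q] matches anything outside the expanded set.
Q_VARIANTS = with_case_variants({"Q"})


def matches_not_q(ch):
    return ch not in Q_VARIANTS


def case_blind_equal(s1, s2):
    """Rule 3: position by position, same character or a case-variant."""
    return len(s1) == len(s2) and all(
        is_case_variant(a, b) for a, b in zip(s1, s2))


def is_upper_case_letter(ch):
    # Rule 4: the category test ignores the "i" flag entirely.
    return unicodedata.category(ch) == "Lu"


# Rule 3 example, run through Python's re engine (an analogy only).
BACKREF = re.compile(r"([md])[aeiou]\1", re.IGNORECASE)
```

With these definitions, "a" and "\u212a" are both in A_TO_Z, none of "I", "O", "i", "o" is in A_TO_Z_MINUS_IO, and BACKREF.fullmatch accepts "Mum", "mom", "Dad", and "DUD". Scanning all of Unicode on each expansion is deliberately naive; a real engine would more likely precompute a case-variant table.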
Received on Tuesday, 27 September 2005 15:52:27 UTC