- From: <bugzilla@wiggum.w3.org>
- Date: Wed, 13 Feb 2008 16:23:51 +0000
- To: public-qt-comments@w3.org
- CC:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=5251

------- Comment #5 from mike@saxonica.com 2008-02-13 16:23 -------

The proposal was discussed at the telcon on 2008-02-12. Two main reservations were expressed:

(a) The use of compare(substring(xxx)) is not equivalent to the current behaviour, because the substring function operates without knowledge of a collation; it might therefore find a match where the current spec would not. (See the minutes for an example.)

(b) Jim Melton felt uneasy about whether this was really a bug: were we sure the status quo wasn't what the WG intended?

I would like to revise the proposal to make it a very simple change. In starts-with() and ends-with(), (a) change "minimal match" to "match", and (b) add the examples proposed in comment #4.

The relevant definitions from Unicode are:

DS2. There is a match according to C for P within Q[s,e] if and only if C generates the same sort key for P as for Q[s,e], and the offsets s and e meet the condition B.

DS4. The match is minimal if for all positive i and j, there is no match at Q[s+i,e-j]. In such a case, we also say that P minimally matches at Q[s,e].

Here C is the collation; B in our case is true so long as we are on a boundary between collation units (we don't say this very clearly, but it's the best interpretation); P is our second argument; and Q is our first argument. s and e are character positions.

Note that DS4 as written is incorrect: it should require only one of i and j to be positive; the other can be zero. This bug has been reported and accepted.

The current rule (for starts-with) requires a minimal match at the start of the arg1 string. This means that if "-" is ignorable (as it often is), then starts-with('-1', '-1') is false. The possible matches are on '-1' and '1': the first doesn't count because it is not minimal, and the second doesn't count because it is not at the start of the string.
I find it impossible to believe that the WG intended this: it's surely a reasonable expectation that every string starts with itself and ends with itself.
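
To make the '-1' case concrete, here is a minimal Python sketch of DS2 and the corrected DS4. It is only an illustration under stated assumptions, not the spec's algorithm: the toy collation sort_key() simply drops "-", condition B is treated as always true, and all function names are hypothetical.

```python
def sort_key(s: str) -> str:
    # Toy collation (assumption): "-" is ignorable, everything else
    # compares by code point.
    return s.replace("-", "")


def is_match(p: str, q: str, s: int, e: int) -> bool:
    # DS2: P matches within Q[s,e] if both produce the same sort key.
    # Condition B is assumed to hold at every character offset.
    return sort_key(p) == sort_key(q[s:e])


def is_minimal_match(p: str, q: str, s: int, e: int) -> bool:
    # DS4 as corrected in this comment: no match at Q[s+i, e-j] for any
    # i, j >= 0 that are not both zero.
    if not is_match(p, q, s, e):
        return False
    for i in range(e - s + 1):
        for j in range(e - s - i + 1):
            if (i, j) != (0, 0) and is_match(p, q, s + i, e - j):
                return False
    return True


def starts_with_current(q: str, p: str) -> bool:
    # Current rule: a minimal match anchored at the start of arg1.
    return any(is_minimal_match(p, q, 0, e) for e in range(len(q) + 1))


def starts_with_proposed(q: str, p: str) -> bool:
    # Proposed rule: any match anchored at the start of arg1.
    return any(is_match(p, q, 0, e) for e in range(len(q) + 1))


print(starts_with_current("-1", "-1"))   # False
print(starts_with_proposed("-1", "-1"))  # True
```

Under this toy collation the current rule rejects starts-with('-1', '-1') because the match on '-1' is not minimal (the shorter '1' also matches), while the proposed rule accepts it.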
Received on Wednesday, 13 February 2008 16:23:57 UTC