[F&O] IBM-FO-104: Description of substring matching should account for ignorable collation units

From: Henry Zongaro <zongaro@ca.ibm.com>
Date: Tue, 17 Feb 2004 20:43:20 -0500
To: public-qt-comments@w3.org
Message-ID: <OFE2CB6873.C0E2DE5D-ON85256E3B.007B2760-85256E3E.0009763B@ca.ibm.com>

[My apologies that these comments are coming in after the end of the Last 
Call comment period.]

Section 7.5

According to the sixth paragraph of this section, "In the definitions 
below, we say that $arg1 contains $arg2 at positions m through n if the 
collation units corresponding to characters in positions m to n of $arg1 
are the same as the collation units corresponding to all the characters of 
$arg2."

This definition is not sufficiently precise in the presence of ignorable 
collation units. The rules should be based on 
http://www.unicode.org/unicode/reports/tr10/#Searching (e.g. a minimal or 
maximal match; a match at Q[s,e] is maximal if, for all positive i and j, 
there is no match at Q[s-i,e+j]).

For example, '-' is ignorable for some collations. It is not clear whether 
substring-before("a-b", "b") returns "a" or "a-".  This needs to be 
clearly specified.  If it is implementation-dependent or 
implementation-defined, that should be clearly specified.
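
To make the ambiguity concrete, here is a minimal Python sketch. It is 
not taken from F&O or from UTS #10. It assumes a toy collation in which 
'-' is the only ignorable character and every other character is its own 
collation unit. The function names, the "policy" parameter, and the 
containment-based reading of "minimal" and "maximal" are illustrative 
assumptions only.

```python
# Assumption: a toy collation in which '-' produces no collation units.
IGNORABLE = {"-"}


def collation_units(text):
    """Map a string to its collation units under the toy collation."""
    return [ch for ch in text if ch not in IGNORABLE]


def matches(arg1, arg2):
    """All half-open spans [s, e) of arg1 whose units equal those of arg2."""
    target = collation_units(arg2)
    return [
        (s, e)
        for s in range(len(arg1) + 1)
        for e in range(s, len(arg1) + 1)
        if collation_units(arg1[s:e]) == target
    ]


def is_minimal(span, spans):
    """True if no other matching span lies inside this one (simplified)."""
    s, e = span
    return not any(o != span and s <= o[0] and o[1] <= e for o in spans)


def is_maximal(span, spans):
    """True if no other matching span contains this one (simplified)."""
    s, e = span
    return not any(o != span and o[0] <= s and e <= o[1] for o in spans)


def substring_before(arg1, arg2, policy="minimal"):
    """Hypothetical substring-before using the earliest minimal or maximal match."""
    spans = matches(arg1, arg2)
    keep = is_minimal if policy == "minimal" else is_maximal
    candidates = [sp for sp in spans if keep(sp, spans)]
    if not candidates:
        return ""
    start, _ = min(candidates)  # earliest starting position wins
    return arg1[:start]


print(substring_before("a-b", "b", "minimal"))  # a-
print(substring_before("a-b", "b", "maximal"))  # a
```

Under these assumptions both "b" and "-b" match "b", so the result 
depends entirely on which match the specification selects.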


[Speaking on behalf of reviewers from IBM.]
Henry Zongaro      Xalan development
IBM SWS Toronto Lab   T/L 969-6044;  Phone +1 905 413-6044
Received on Tuesday, 17 February 2004 20:43:27 UTC