RE: [XQuery 1.0 and XPath 2.0 Functions and Operators] from Kay, Michael on 2004-01-13 (public-qt-comments@w3.org from January 2004)

From: Kay, Michael <Michael.Kay@softwareag.com>
Date: Tue, 13 Jan 2004 16:57:03 +0100
To: "Igor Hersht" <igorh@ca.ibm.com>, <public-qt-comments@w3.org>
Message-ID: <37B64F4BA60E9E4FB9F60929E05980242877A8@DAEMSG03.eur.ad.sag>

A personal response, prior to any WG discussion of the comments:

> 
> SUGGESTION 1:
> 
> 7.4.7 fn:upper-case and 7.4.8 fn:lower-case
> How we can find what language to use? From default collation? 
> It is not going to be flexible if from default collation. 
> Possible solution- second optional parameter xml:lang

We made the decision to go for a language-independent mapping of
lower-case upper-case and vice-versa; we felt that providing
language-dependent mappings was outside the 80/20 cutoff. Remember that
there is plenty of provision for additional functions to be provided
outside the core library.

> 
> SUGGESTION 2:
> 
> 7.5 Functions Based on Substring Matching
> The rules are ambiguous if there are ignorable collation units.
> 
> example
> '-' is ignorable for some collations. substring-before("a-b", 
> "b") returns "a" or "a-"?
> 
> Matching rules should be more precise and based on 
> http://www.unicode.org/unicode/reports/tr10/> #Searching
> (e.g. 
> minimal or maximal.
> For all positive i and 
> j, there is no match at Q[s-i,e+j].)
> 

I don't think we want to be too prescriptive in terms of the collation
algorithms that vendors use. But I agree with you that the rules for
substring-before and substring-after could be clearer.

I would suggest that substring-before($s1, $s2) is defined as:

substring($s1, 1, $n -1) where $n is the lowest integer that satisfies
starts-with(substring($s1, $n), $s2)

And substring-after($s1, $s2) is defined as:

substring($s1, $n + string-length($s2)) where $n in the lowest integer
that satisfies starts-with(substring($s1, $n), $s2)

(these rules will need augmenting for the case where there is no match).

I think the rules for starts-with, contains, and ends-with are
unambiguous.

Michael Kay

Received on Tuesday, 13 January 2004 10:56:42 UTC