- From: Kay, Michael <Michael.Kay@softwareag.com>
- Date: Tue, 13 Jan 2004 16:57:03 +0100
- To: "Igor Hersht" <igorh@ca.ibm.com>, <public-qt-comments@w3.org>
A personal response, prior to any WG discussion of the comments:
>
> SUGGESTION 1:
>
> 7.4.7 fn:upper-case and 7.4.8 fn:lower-case
> How we can find what language to use? From default collation?
> It is not going to be flexible if from default collation.
> Possible solution- second optional parameter xml:lang
We made the decision to go for a language-independent mapping of
lower-case to upper-case and vice versa; we felt that providing
language-dependent mappings was outside the 80/20 cutoff. Remember that
there is plenty of provision for additional functions to be provided
outside the core library.
>
> SUGGESTION 2:
>
> 7.5 Functions Based on Substring Matching
> The rules are ambiguous if there are ignorable collation units.
>
> example
> '-' is ignorable for some collations. substring-before("a-b",
> "b") returns "a" or "a-"?
>
> Matching rules should be more precise and based on
> http://www.unicode.org/unicode/reports/tr10/#Searching
> (e.g.
> minimal or maximal.
> For all positive i and
> j, there is no match at Q[s-i,e+j].)
>
I don't think we want to be too prescriptive in terms of the collation
algorithms that vendors use. But I agree with you that the rules for
substring-before and substring-after could be clearer.
I would suggest that substring-before($s1, $s2) is defined as:
substring($s1, 1, $n - 1) where $n is the lowest integer that satisfies
starts-with(substring($s1, $n), $s2)
And substring-after($s1, $s2) is defined as:
substring($s1, $n + string-length($s2)) where $n is the lowest integer
that satisfies starts-with(substring($s1, $n), $s2)
(these rules will need augmenting for the case where there is no match).
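To make the two definitions above concrete, here is a rough sketch in Python rather than XPath. Everything in it is an assumption made for illustration: the toy "collation" simply ignores '-', the helper names (collation_key, first_match and so on) are hypothetical, and returning the empty string when there is no match is one possible way of augmenting the rules, not something settled here.

```python
# Toy collation: '-' is treated as ignorable (an assumption for illustration).
IGNORABLE = {"-"}


def collation_key(s: str) -> str:
    """Drop ignorable characters so strings compare under the toy collation."""
    return "".join(ch for ch in s if ch not in IGNORABLE)


def starts_with(s1: str, s2: str) -> bool:
    """starts-with($s1, $s2) under the toy collation."""
    return collation_key(s1).startswith(collation_key(s2))


def substring(s: str, start: int, length=None) -> str:
    """XPath-style substring with 1-based positions (integer arguments only)."""
    begin = max(start - 1, 0)
    if length is None:
        return s[begin:]
    end = max(start - 1 + length, 0)
    return s[begin:end]


def first_match(s1: str, s2: str):
    """Lowest $n satisfying starts-with(substring($s1, $n), $s2), or None."""
    for n in range(1, len(s1) + 2):
        if starts_with(substring(s1, n), s2):
            return n
    return None


def substring_before(s1: str, s2: str) -> str:
    n = first_match(s1, s2)
    if n is None:
        return ""  # assumed behaviour when there is no match
    return substring(s1, 1, n - 1)


def substring_after(s1: str, s2: str) -> str:
    n = first_match(s1, s2)
    if n is None:
        return ""  # assumed behaviour when there is no match
    return substring(s1, n + len(s2))


print(substring_before("a-b", "b"))  # prints: a
print(substring_after("abc", "b"))   # prints: c
```

Under this toy collation the example from the comment, substring-before("a-b", "b"), comes out as "a", because the lowest matching $n is 2. One thing the sketch also makes visible: string-length($s2) counts the characters of $s2 and not those of the matched span, so with ignorable characters the substring-after rule may need further thought (here substring_after("a-b", "b") gives "b").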
I think the rules for starts-with, contains, and ends-with are
unambiguous.
Michael Kay
Received on Tuesday, 13 January 2004 10:56:42 UTC