RE: [XQuery 1.0 and XPath 2.0 Functions and Operators] from Igor Hersht on 2004-01-13 (public-qt-comments@w3.org from January 2004)

From: Igor Hersht <igorh@ca.ibm.com>
Date: Tue, 13 Jan 2004 15:30:48 -0500
To: "Kay, Michael" <Michael.Kay@softwareag.com>
Cc: public-qt-comments@w3.org
Message-ID: <OF51569120.3533685E-ON85256E1A.00709AE5@ca.ibm.com>
Thank you for your response. It makes perfect sense.
I just have a different opinion on some issues.
(It is just my private opinion).

>We made the decision to go for a language-independent mapping of
>lower-case to upper-case and vice versa.
Then one has to remove the notes from
7.4.7 fn:upper-case and 7.4.8 fn:lower-case, which say
explicitly that fn:upper-case and fn:lower-case are language
dependent.
You would need fn:upper-case and fn:lower-case functionality
for other stuff (e.g. xsl:number) anyway.

I also don't understand why we cannot permit
people to have richer (i18n-friendly) functionality. (It could be just a
permission: the set of languages could be implementation-defined, and the
"default" could be a language-independent one.)

> I don't think we want to be too prescriptive in terms of the
>collation algorithms that vendors use.

I would agree if it were difficult to implement.
Otherwise, an unambiguous, implementable spec is a good idea.


>I would suggest that substring-before($s1, $s2) is defined as:
...
I don't understand the rationale for using XSLT-specific definitions
for string matching if it has already been defined by common
Unicode specs.

>I think the rules for starts-with, contains, and ends-with are
>unambiguous
I think they are ambiguous: is starts-with("-a", " a") true or false?
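
To make the question concrete, here is a minimal Python sketch of two
possible readings of starts-with under a collation with ignorable
characters. The ignorable set (hyphen and space) and both function names
are assumptions for illustration only; neither reading is taken from the
Functions and Operators draft.

```python
# Assumption: a toy collation in which "-" and " " are ignorable.
IGNORABLE = {"-", " "}


def collation_key(s: str) -> str:
    """Drop ignorable characters; the rest is compared code point by code point."""
    return "".join(ch for ch in s if ch not in IGNORABLE)


def starts_with_by_key(s1: str, s2: str) -> bool:
    """Reading 1: compare the strings after ignorable characters are removed."""
    return collation_key(s1).startswith(collation_key(s2))


def starts_with_by_codepoint(s1: str, s2: str) -> bool:
    """Reading 2: ignorable characters still count and must match exactly."""
    return s1.startswith(s2)


print(starts_with_by_key("-a", " a"))        # True
print(starts_with_by_codepoint("-a", " a"))  # False
```

Under these assumptions the same call yields true in one reading and
false in the other.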


Igor Hersht
XSLT Development
IBM Canada Ltd., 8200 Warden Avenue, Markham, Ontario L6G 1C7
Office D2-260, Phone (905)413-3240 ; FAX  (905)413-4839


"Kay, Michael" <Michael.Kay@softwareag.com>
01/13/2004 10:57 AM

To:      Igor Hersht/Toronto/IBM@IBMCA, <public-qt-comments@w3.org>
cc:
Subject: RE: [XQuery 1.0 and XPath 2.0 Functions and Operators]



A personal response, prior to any WG discussion of the comments:

>
> SUGGESTION 1:
>
> 7.4.7 fn:upper-case and 7.4.8 fn:lower-case
> How we can find what language to use? From default collation?
> It is not going to be flexible if from default collation.
> Possible solution- second optional parameter xml:lang

We made the decision to go for a language-independent mapping of
lower-case to upper-case and vice versa; we felt that providing
language-dependent mappings was outside the 80/20 cutoff. Remember that
there is plenty of provision for additional functions to be provided
outside the core library.
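
The difference between the two approaches can be sketched in Python as
follows. The function name upper_case and its optional lang argument are
hypothetical, modelled on the "second optional parameter" in Suggestion 1;
only Turkish is tailored here, as a stand-in for a real per-language table.

```python
from typing import Optional


def upper_case(s: str, lang: Optional[str] = None) -> str:
    """Language-independent upper-casing unless a language is given.

    Assumption: only "tr" (Turkish) is tailored in this sketch.
    """
    if lang == "tr":
        # Turkish pairs dotted small i with dotted capital I (U+0130).
        s = s.replace("i", "\u0130")
    # Python's str.upper() applies the language-independent Unicode mapping.
    return s.upper()


print(upper_case("title"))        # TITLE
print(upper_case("title", "tr"))  # T\u0130TLE, i.e. with a dotted capital I
```

With no language argument the result is the language-independent mapping,
which corresponds to the behaviour of the core function described above.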

>
> SUGGESTION 2:
>
> 7.5 Functions Based on Substring Matching
> The rules are ambiguous if there are ignorable collation units.
>
> example
> '-' is ignorable for some collations. substring-before("a-b",
> "b") returns "a" or "a-"?
>
> Matching rules should be more precise and based on
> http://www.unicode.org/unicode/reports/tr10/> #Searching
> (e.g.
> minimal or maximal.
> For all positive i and
> j, there is no match at Q[s-i,e+j].)
>

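The "a" or "a-" question in Suggestion 2 can be reproduced with a small
Python sketch. It assumes a toy collation in which only "-" is ignorable,
and it treats a span as a match when its key equals the key of the search
string; the helper names are hypothetical, and the minimal/maximal
distinction is a simplified reading of the searching section of tr10.

```python
# Assumption: a toy collation in which "-" is ignorable.
IGNORABLE = {"-"}


def key(s: str) -> str:
    """Drop ignorable characters from s."""
    return "".join(ch for ch in s if ch not in IGNORABLE)


def matches(q: str, p: str):
    """All non-empty spans (start, end) of q whose key equals the key of p."""
    target = key(p)
    return [
        (s, e)
        for s in range(len(q))
        for e in range(s + 1, len(q) + 1)
        if key(q[s:e]) == target
    ]


def inside(a, b) -> bool:
    """True if span a lies within span b and differs from it."""
    return b[0] <= a[0] and a[1] <= b[1] and a != b


def minimal(ms):
    """Matches that contain no smaller match."""
    return [m for m in ms if not any(inside(o, m) for o in ms)]


def maximal(ms):
    """Matches that are contained in no larger match."""
    return [m for m in ms if not any(inside(m, o) for o in ms)]


ms = matches("a-b", "b")
print(ms)                            # [(1, 3), (2, 3)]
print("a-b"[: minimal(ms)[0][0]])    # a-
print("a-b"[: maximal(ms)[0][0]])    # a
```

Taking the text before the minimal match gives "a-", while taking the text
before the maximal match gives "a".
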
I don't think we want to be too prescriptive in terms of the collation
algorithms that vendors use. But I agree with you that the rules for
substring-before and substring-after could be clearer.

I would suggest that substring-before($s1, $s2) is defined as:

substring($s1, 1, $n -1) where $n is the lowest integer that satisfies
starts-with(substring($s1, $n), $s2)

And substring-after($s1, $s2) is defined as:

substring($s1, $n + string-length($s2)) where $n is the lowest integer
that satisfies starts-with(substring($s1, $n), $s2)

(these rules will need augmenting for the case where there is no match).
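
As a rough check, the two definitions can be modelled in Python. This is a
sketch under stated assumptions: the collation is a toy one in which only
"-" is ignorable, starts_with compares keys with ignorables removed, and
the no-match case returns the empty string (the rules above leave that
case open). All function names are hypothetical.

```python
# Assumption: a toy collation in which "-" is ignorable.
IGNORABLE = {"-"}


def key(s: str) -> str:
    """Drop ignorable characters from s."""
    return "".join(ch for ch in s if ch not in IGNORABLE)


def starts_with(s1: str, s2: str) -> bool:
    """Collation-aware starts-with: compare keys with ignorables removed."""
    return key(s1).startswith(key(s2))


def first_match(s1: str, s2: str):
    """Lowest 1-based n with starts-with(substring(s1, n), s2), or None."""
    for n in range(1, len(s1) + 1):
        if starts_with(s1[n - 1:], s2):
            return n
    return None


def substring_before(s1: str, s2: str) -> str:
    n = first_match(s1, s2)
    if n is None:
        return ""  # assumption: the no-match rule is not given above
    return s1[: n - 1]  # substring($s1, 1, $n - 1)


def substring_after(s1: str, s2: str) -> str:
    n = first_match(s1, s2)
    if n is None:
        return ""  # assumption, as above
    return s1[n - 1 + len(s2):]  # substring($s1, $n + string-length($s2))


print(substring_before("a-b", "b"))  # a
print(substring_after("a-b", "b"))   # b
```

In this model $n is 2 for ("a-b", "b"), because "-b" already satisfies
starts-with. substring-before therefore yields "a", and substring-after
yields "b", since string-length($s2) counts one character while the match
starting at $n spans two.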

I think the rules for starts-with, contains, and ends-with are
unambiguous.

Michael Kay
Received on Tuesday, 13 January 2004 15:35:39 UTC