RE: XQuery feedback from XQuery on 2003-02-03 (public-qt-comments@w3.org from February 2003)

From: XQuery <xquery@attbi.com>
Date: Mon, 3 Feb 2003 08:32:58 -0800
To: "'Kay, Michael'" <Michael.Kay@softwareag.com>, <public-qt-comments@w3.org>
Message-ID: <000001c2cba1$e9d74e50$6601a8c0@brundage1>

Ah, right, I forgot that xs:anyURI doesn't derive from xs:string.  Sorry about that.

I understand why ID/IDREF are special, although I would call out this
behavior very explicitly and carefully in the spec, instead of just letting
the fact accidentally fall out of the comparison and id() definitions.
Right now it's not clear whether this is an error in the spec or
intentional ("by design").

I think users will definitely be surprised when id() performs a
case-sensitive lookup but eq performs a case-insensitive ID/IDREF comparison
(if their collation is case-insensitive).  The typical user assumes lookup
matches the behavior of equality comparisons (something the Java and C#
class libraries reinforce in their definitions of Hashtable).
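
To make the mismatch concrete, here is a minimal sketch in Python (not
XQuery). Everything in it is an assumption made for illustration: the dict
stands in for an ID index built at load time, casefold() stands in for a
case-insensitive collation, and the names id_index, collation_eq and
id_lookup are hypothetical.

```python
# Hypothetical stand-in for an ID index built when the document is loaded:
# keys are matched by exact codepoints, independent of any query collation.
id_index = {"Z": "element-with-ID-Z"}


def collation_eq(x: str, y: str) -> bool:
    """Model of 'eq' under a case-insensitive default collation (assumed)."""
    return x.casefold() == y.casefold()


def id_lookup(key: str):
    """Model of id(): exact codepoint match against the load-time index."""
    return id_index.get(key)


# The two operations disagree about whether "z" matches "Z".
print(collation_eq("z", "Z"))  # equality says the values are the same
print(id_lookup("z"))          # lookup finds nothing for the same value
```

A user who sees the comparison succeed would reasonably expect the lookup
to succeed too, which is the surprise described above.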

(Also, it's a little weird to special-case NOTATION, isn't it?)

-----Original Message-----
From: public-qt-comments-request@w3.org
[mailto:public-qt-comments-request@w3.org] On Behalf Of Kay, Michael
Sent: Monday, February 03, 2003 3:23 AM
To: xquery@attbi.com; public-qt-comments@w3.org
Subject: RE: XQuery feedback



> I also suspect there are some other design mistakes here.
> For example, the definition of the id() function makes it 
> clear that ID/IDREF values are matched without respect to the 
> default collation (which is good, since these indices are 
> typically created when the data model is loaded, without 
> respect to any query collations).  However, the rules for 
> comparing xs:ID or xs:IDREF values using any of the value 
> comparison operators fall back on the xs:string comparison 
> rules, which depend on the default collation. Similarly, 
> xs:anyURI and xs:NOTATION are compared with eq/ne using code 
> points, but with gt/ge/lt/le using collations (because then 
> they fall back on xs:string comparisons). Inconsistencies 
> like these will drive users crazy.
>
This area is very tricky, and I agree there is more work to be done.

ID and IDREF are subtypes of string, so it's very hard to give them
different comparison semantics from strings without causing all sorts of
strange effects. For example:

let $a := "z"
let $b := "Z"
let $c := xs:ID("z")
let $d := xs:ID("Z")

What should distinct-values(($a, $b, $c, $d)) return, if the default
collation is case-blind? Much as I would like equality comparison of IDs
(and NCNames, etc.) to follow strict codepoint-comparison semantics, I
don't think it can be done. The problem is that "eq" needs to be transitive:
a=b and b=c => a=c. That fails in the above example if you use codepoint
comparison for two IDs: $c eq $a and $a eq $d both hold under the case-blind
collation, yet $c eq $d would be false.
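
The transitivity problem can be sketched in Python (not XQuery). This is
only a model under stated assumptions: values are (type, text) pairs,
casefold() stands in for a case-blind collation, and the function mixed_eq
is a hypothetical rule that uses codepoint comparison only when both
operands are IDs.

```python
def mixed_eq(x, y) -> bool:
    """Hypothetical rule: codepoint comparison for ID vs ID,
    case-blind (collation-style) comparison for everything else."""
    (type_x, text_x), (type_y, text_y) = x, y
    if type_x == "ID" and type_y == "ID":
        return text_x == text_y
    return text_x.casefold() == text_y.casefold()


# The values from the example above, modelled as (type, text) pairs.
a = ("string", "z")
b = ("string", "Z")
c = ("ID", "z")
d = ("ID", "Z")

# c equals a, and a equals d, but c does not equal d:
# the mixed rule is not transitive.
print(mixed_eq(c, a), mixed_eq(a, d), mixed_eq(c, d))
```

With a non-transitive "eq", the result of distinct-values would depend on
the order in which the items are compared.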

The xs:anyURI type has different problems, because it *isn't* a subtype of string.
Here we can certainly define that equality uses codepoint comparison (and we
can say that "<" etc. throw an error). The problem we have is that you can't
compare an xs:anyURI to a string without casting one of them to the type of
the other.

Michael Kay

Received on Monday, 3 February 2003 11:33:48 UTC