RE: XQuery feedback

> I also suspect there are some other design mistakes here.  
> For example, the definition of the id() function makes it 
> clear that ID/IDREF values are matched without respect to the 
> default collation (which is good, since these indices are 
> typically created when the data model is loaded, without 
> respect to any query collations).  However, the rules for 
> comparing xs:ID or xs:IDREF values using any of the value 
> comparison operators fall back on the xs:string comparison 
> rules, which depend on the default collation. Similarly, 
> xs:anyURI and xs:NOTATION are compared with eq/ne using code 
> points, but with gt/ge/lt/le using collations (because then 
> they fall back on xs:string comparisons). Inconsistencies 
> like these will drive users crazy.
>
This area is very tricky, and I agree there is more work to be done.

ID and IDREF are subtypes of string, so it's very hard to give them
different comparison semantics from strings without causing all sorts of
strange effects. For example:

let $a := "z"
let $b := "Z"
let $c := xs:ID("z")
let $d := xs:ID("Z")

What should distinct-values(($a, $b, $c, $d)) return, if the default
collation is case-blind? Much though I would like equality comparison of IDs
(and NCNames, etc) to follow the strict codepoint-comparison semantics, I
don't think it can be done. The problem is that "eq" needs to be transitive:
a=b and b=c => a=c, which isn't true in the above example if you use
codepoint comparison for comparing two IDs.

The xs:anyURI has different problems, because it *isnt* a subtype of string.
Here we can certainly define that equality uses codepoint comparison (and we
can say that "<" etc throws an error). The problem we have is that you can't
compare an xs:anyURI to a string, without casting one of them to the type of
the other.

Michael Kay 

Received on Monday, 3 February 2003 06:23:00 UTC