Re: String complements

> 
> When sorting using a collation, we must use not the codepoint for a character, but its index in the sorted characters of this collation.
> 
> This is why it is important to have a function 
> 
>     fn:collation-characters($collation-name as xs:string) as xs:string 
> 
> that returns the sorted (according to this collation) individual characters of the collation.
> 

Collations don't work character by character. First, they split a string into "collation elements" (or collation units), each of which may span several characters. Second, they compare in a number of passes: first by the primary weights of the collation elements, then by the secondary weights, and so on.
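To make this concrete, here is a minimal sketch in Python. The weight table is invented purely for illustration (real weights come from a table such as the one described in UTS #10, and there are more than two levels), and the names WEIGHTS, collation_elements and sort_key are hypothetical. It shows one contraction ("ch" treated as a single collation element, as in traditional Spanish ordering) and an accent that only matters at the secondary level.

```python
# Toy weight table: collation element -> (primary, secondary).
# These numbers are made up for illustration only; they are not taken
# from any real collation.
WEIGHTS = {
    "a": (1, 0),
    "\u00e1": (1, 1),  # a-acute: same primary weight as "a", differs at secondary level
    "c": (2, 0),
    "ch": (3, 0),      # contraction: two characters, one collation element
    "d": (4, 0),
    "e": (5, 0),
    "o": (6, 0),
}

def collation_elements(s):
    """Split s into collation elements, preferring a two-character match."""
    elements = []
    i = 0
    while i < len(s):
        pair = s[i:i + 2]
        if len(pair) == 2 and pair in WEIGHTS:
            elements.append(pair)
            i += 2
        else:
            elements.append(s[i])
            i += 1
    return elements

def sort_key(s):
    """Build a key with all primary weights first, then all secondary weights."""
    weights = [WEIGHTS[e] for e in collation_elements(s)]
    primary = tuple(w[0] for w in weights)
    secondary = tuple(w[1] for w in weights)
    return (primary, secondary)

words = ["coda", "chao", "c\u00e1da", "cada", "code"]
by_codepoint = sorted(words)
by_collation = sorted(words, key=sort_key)
print(by_codepoint)
print(by_collation)
```

In this toy collation "chao" sorts after "code" because "ch" is one element with its own primary weight, and the accented word sorts immediately after "cada" because the accent is only consulted once the primary weights tie. Neither effect can be reproduced by replacing each character with its position in a single sorted list of characters.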

Enjoy some bedtime reading: http://www.unicode.org/reports/tr10/

We do offer a function fn:collation-key() which, given a string and a collation, returns a binary value such that the ordering of the binary values matches the ordering of the strings under the collation. This is very important when implementing things like distinct-values or for-each-group using an arbitrary collation; it can also be used to construct keys for maps.
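As a rough illustration of why such keys are useful, here is a Python sketch. It is not how fn:collation-key() is implemented: toy_collation_key is a hypothetical stand-in that ignores case and accents, and distinct_values and group_by are illustrative names only. The point is that once each string is reduced to a binary key, de-duplication and grouping become ordinary dictionary lookups.

```python
import unicodedata

def toy_collation_key(s):
    """Hypothetical stand-in for a collation key.

    Returns bytes that are equal for strings differing only in case or
    in accents. A real collation key would be derived from the
    collation's weight tables instead.
    """
    decomposed = unicodedata.normalize("NFD", s.casefold())
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.encode("utf-8")

def distinct_values(values, key=toy_collation_key):
    """Keep the first value seen for each distinct key."""
    seen = {}
    for v in values:
        seen.setdefault(key(v), v)
    return list(seen.values())

def group_by(values, key=toy_collation_key):
    """Group values whose keys are equal, in order of first appearance."""
    groups = {}
    for v in values:
        groups.setdefault(key(v), []).append(v)
    return list(groups.values())

values = ["R\u00e9sum\u00e9", "resume", "RESUME", "apple", "Apple"]
print(distinct_values(values))
print(group_by(values))
```

The same keys can serve directly as map keys, since two strings that are equal under the collation always produce identical bytes.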

Michael Kay

Received on Friday, 15 March 2024 18:41:32 UTC