Re: PLease define 'collation'

> A collation is an artefact that assigns a sort order

"Collation" as used here needs to do more than this. The definition of
contains() substring-before() etc needs more than just an ordering
on sequences of characters.

Xpath2 currently says (in 2.1.1, public draft)

  A collation may be regarded as an object that supports two functions: a
  function that given a set of strings, returns a sequence containing
  those strings in sorted order; and a function that given two strings,
  returns true if they are considered equal, and false if not.

but that also is not enough to support substring matching.

Specifically the definition of collation ought to include (as a
definition of a special type of collation) or forward reference to
the definition in XPath2.0 section 7.5


     All collations support the capability of deciding whether two
     strings are considered equal, and if not, which of the strings
     should be regarded as preceding the other. For functions such as
     fn:compare(), this is all that is required. For other functions,
     such as fn:contains(), the collation needs to support an additional
     property: it must be able to decompose the string into a sequence
     of collation units, each unit consisting of one or more characters,
     such that two strings can be compared by pairwise comparison of
     these units. The string $arg1 is then considered to contain $arg2
     as a substring if the sequence of collation units corresponding to
     $arg2 is a subsequence of the sequence of the collation units
     corresponding to $arg1. The characters in $arg1 that match are the
     characters corresponding to these collation units.


David

________________________________________________________________________
This e-mail has been scanned for all viruses by Star Internet. The
service is powered by MessageLabs. For more information on a proactive
anti-virus service working around the clock, around the globe, visit:
http://www.star.net.uk
________________________________________________________________________

Received on Wednesday, 2 June 2004 09:11:49 UTC