RE: [Bug 1753] The sorting process from Michael Kay on 2005-07-28 (public-qt-comments@w3.org from July 2005)

From: Michael Kay <mhk@mhk.me.uk>
Date: Fri, 29 Jul 2005 00:24:31 +0100
To: "'Kenneth Stephen'" <marvin.the.cynical.robot@gmail.com>, <public-qt-comments@w3.org>
Cc: <bugzilla@wiggum.w3.org>
Message-ID: <E1DyHkD-00056y-Km@maggie.w3.org>

Bug 1753 was about edge cases in sorting caused by the fact that the
different numeric types support different precision: it has nothing to do
with sorting of character strings.
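
To make that edge case concrete, here is a minimal Python sketch. It is not taken from the bug report: the sample values and the helper name sort_key_as_double are illustrative assumptions. It shows two decimal values that are distinct as decimals but collapse to the same double, and a sort key that follows the rule in the draft text quoted at the end of this message: compare every value as a double, with NaN equal to NaN and lower than any other number.

```python
import math
from decimal import Decimal

# Illustrative values (assumptions): two decimals that differ only beyond
# double precision, plus a double.
a = Decimal("1.00000000000000000001")
b = Decimal("1.00000000000000000002")
c = 1.0

# As decimals the two values are ordered, but once promoted to double
# both are equal to c, so mixing the two comparisons is not a total order.
decimals_differ = a < b
equal_as_doubles = float(a) == c and float(b) == c


def sort_key_as_double(value):
    """Hypothetical helper: compare every value as a double.

    NaN sorts before all other numbers, and all NaNs compare equal.
    """
    d = float(value)
    if math.isnan(d):
        return (0, 0.0)
    return (1, d)


values = [Decimal("2.5"), float("nan"), c, b, float("-inf"), a]
# sorted() is stable, so values that are equal as doubles keep input order.
ordered = sorted(values, key=sort_key_as_double)
```

Using one key function for every item is what gives a consistent ordering here; Python's own mixed Decimal/float comparison is exact and is deliberately not used.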

I'm surprised that you could have missed the considerable attention that
XSLT 2.0 and XPath 2.0 give to the use of collations other than Unicode
codepoint collation.

From an XSLT perspective, you will find an account of the subject at

http://www.w3.org/TR/xslt20/#collating-sequences

which in turn refers to a section in Functions and Operators:

http://www.w3.org/TR/xpath-functions/#string-compare

The general philosophy is to allow any number of different collations, and
to identify collations by URI. Initially we expect that each vendor will
define their own URIs to refer to collations that they provide, or they may
allow users to define "alias URIs" to provide interoperability. However, the
definition and naming of specific collations is outside our scope.
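
As a rough illustration of identifying collations by URI, here is a minimal Python sketch. The registry, the function name sort_with_collation, and the example.com URI are hypothetical; only the codepoint collation URI is real, and it is shown in the form used by the final Functions and Operators Recommendation, which may differ from the draft current at the time of this message.

```python
# Codepoint collation URI as defined in the final Functions and Operators
# Recommendation; the second URI is a made-up vendor-style example.
CODEPOINT = "http://www.w3.org/2005/xpath-functions/collation/codepoint"
CASEBLIND = "http://example.com/collation/caseblind"  # hypothetical

# Hypothetical registry: each collation URI maps to a sort-key function.
COLLATIONS = {
    CODEPOINT: lambda s: s,            # Python str compares by code point
    CASEBLIND: lambda s: s.casefold(),  # simple case-insensitive ordering
}


def sort_with_collation(strings, collation_uri=CODEPOINT):
    """Sort strings using the collation registered under collation_uri."""
    try:
        key = COLLATIONS[collation_uri]
    except KeyError:
        raise ValueError(f"unknown collation URI: {collation_uri}") from None
    return sorted(strings, key=key)


words = ["banana", "Zebra", "apple", "Apple"]
by_codepoint = sort_with_collation(words)
by_caseblind = sort_with_collation(words, CASEBLIND)
```

With codepoint order the uppercase-initial words sort ahead of the lowercase ones; the case-blind collation interleaves them. A real implementation would back each URI with a proper locale-aware collator rather than a key function like these.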

For the approach taken by one specific vendor, see

http://www.saxonica.com/documentation/conformance/collation-uri.html

Michael Kay
 

> -----Original Message-----
> From: public-qt-comments-request@w3.org 
> [mailto:public-qt-comments-request@w3.org] On Behalf Of 
> Kenneth Stephen
> Sent: 28 July 2005 23:38
> To: public-qt-comments@w3.org
> Cc: bugzilla@wiggum.w3.org
> Subject: Re: [Bug 1753] The sorting process
> 
> 
> Hi,
> 
>     I confess, I'm a bit confused at the definition of the behaviour
> of the comparison operators being implied by this bug and the XPath
> spec. In general, determination of the collation sequence of Unicode
> data is very complex. See http://www.unicode.org/reports/tr10/ for
> details. In particular, I call your attention to
> http://www.unicode.org/reports/tr10/#Common_Misperceptions - none of
> which seem to be addressed by the XPath spec. My impression from
> reading the XPath spec is that comparison order is defined by Unicode
> codepoint order - but this is not useful for languages other than
> English.
> 
>     Am I missing something? Or is it expected that XPath
> implementations will implement the Unicode collation algorithm under
> the covers and that the various options that are possible (for example
> ignoring punctuation, or case sensitivity) are going to be
> implementation defined?
> 
> Thanks,
> Kenneth
> 
> On 7/28/05, bugzilla@wiggum.w3.org <bugzilla@wiggum.w3.org> wrote:
> > 
> > http://www.w3.org/Bugs/Public/show_bug.cgi?id=1753
> > 
> > 
> > mike@saxonica.com changed:
> > 
> >            What    |Removed                     |Added
> > ----------------------------------------------------------------------------
> >              Status|RESOLVED                    |CLOSED
> > 
> > 
> > 
> > 
> > ------- Additional Comments From mike@saxonica.com  2005-07-28 21:37 -------
> > Change now applied to the editor's draft. The relevant para now reads (changes
> > marked at="Y"):
> > 
> > <p>In general, comparison of two ordinary values is
> >  performed according to the rules of the
> >  XPath <code>lt</code> operator. <phrase diff="add" at="Y">To ensure a total
> > ordering, the same
> >  implementation of the
> >  <code>lt</code> operator <rfc2119>must</rfc2119> be used for all the
> > comparisons: the one that is chosen
> >  is the one appropriate to the most specific type to which all the values can be
> > converted by subtype substitution
> >  and/or type promotion. For example, if the sequence contains both
> > <code>xs:decimal</code> and <code>xs:double</code>
> >  values, then the values are compared using <code>xs:double</code> comparison,
> > even when comparing two
> >  <code>xs:decimal</code> values.</phrase>
> >  NaN values, for sorting purposes, are considered to be equal to each other,
> >  and less than any other numeric value. Special rules
> >   also apply to the <code>xs:string</code> type
> >   and types derived by restriction from <code>xs:string</code>,
> >   as described in the next section.</p>
> > 
Received on Thursday, 28 July 2005 23:24:58 UTC