Sort keys and selecting nodes

Hi,

Meditating on the wonders of the separator attribute of xsl:value-of
(which I think is a work of genius :) made me think about the other
select attribute that used to take a string-expression, namely the
select attribute on xsl:sort.

The rules for calculating the value of the sort key are addressed in
the following two paragraphs in Section 12.1 (The xsl:sort Element):

 "If the xsl:sort element has a data-type attribute, then the sort
  key is converted to the target data type before comparing it with
  other items. [ERR064] The target data type for each xsl:sort element
  is determined by the effective value of its data-type attribute. If
  this has the value text, the target data type is xsd:string. If it
  has the value number, the target data type is xsd:double. Otherwise,
  the target data type must be the name of a primitive data type in
  XML Schema (see [XML Schema]). It is a dynamic error if any other
  value is supplied. The processor must either signal the error, or
  must recover by continuing as if the data-type attribute were not
  specified. Each sort key is converted to the target data type using
  the rules for the XPath cast expression. [ERR065] It is a dynamic
  error if any value obtained by evaluating the select attribute of an
  xsl:sort element cannot be converted to the target data type. The
  processor must either signal the error, or must recover by treating
  the value as being less than any value for which conversion
  succeeds, but equal to any other value for which conversion fails.
  This means that values that cannot be converted to the target data
  type will appear together at the start of the sorted sequence if
  order is ascending, or at the end if order is descending.

 "If there is no data-type attribute, then the computed sort keys are
  not converted before comparison, except in the case where the data
  type of a computed sort key is a complex type, in which case it is
  converted to a string as if by the XPath string function."
             http://www.w3.org/TR/xslt20/#section-The-xsl:sort-Element

I don't think it's particularly clear from this description what
happens if the select attribute evaluates to a sequence of more than
one simple-typed value, or to a node or a sequence of nodes. Possibly
this is because the sort key is described as being converted to the
target data type as if by the XPath cast expression, rather than in
terms of a required type and the basic conversion rules.

I think it might be clearer if you said something along the lines of:

  The sort key is converted to a required type according to the rules
  in XPath 2.0 [Section 2.1.2 Type Conversion].

  The required type for the sort key is determined by the data-type
  attribute. The required type is xs:string if the data-type attribute
  is missing or has the value 'text', xs:double if the data-type
  attribute has the value 'number', and otherwise the type specified
  in the data-type attribute.

  It is a dynamic error if the type conversion results in an error,
  and the processor may recover by ... (as above).
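
As a rough sketch of how that mapping would play out (the element and
attribute names are made up for illustration, and the xsd prefix is
assumed to be bound to the XML Schema namespace):

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsl:template match="orders">
    <xsl:for-each select="order">
      <!-- no data-type: the required type would be xs:string -->
      <xsl:sort select="customer"/>
      <!-- data-type="number": the required type would be xs:double -->
      <xsl:sort select="total" data-type="number"/>
      <!-- a primitive XML Schema type, named directly -->
      <xsl:sort select="@placed" data-type="xsd:date"/>
      <xsl:copy-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```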

Alternatively, it might help to say something before these paragraphs
about what happens if the expression in the select attribute evaluates
to a sequence (is the first value taken?) or to a node (is its typed
value taken?). You state that the typed value of the item in the
initial sequence is taken if the select attribute is missing, but that
doesn't tell me what happens if the select attribute is present with
the value '.'.
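
To make the question concrete, here is a minimal sketch. The input
document and element names are invented for illustration only:

```xml
<!-- Hypothetical input:
     <shop>
       <item><price>12</price><price>9</price></item>
       <item><price>10</price></item>
       <name>b</name><name>a</name>
     </shop> -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="shop">
    <xsl:for-each select="item">
      <!-- For the first item, select="price" yields two nodes.
           Is the sort key the first value, or is this an error? -->
      <xsl:sort select="price" data-type="number"/>
      <xsl:copy-of select="."/>
    </xsl:for-each>
    <xsl:for-each select="name">
      <!-- select omitted: the draft says the typed value is used -->
      <xsl:sort/>
      <xsl:copy-of select="."/>
    </xsl:for-each>
    <xsl:for-each select="name">
      <!-- select="." yields the node itself: is the key its typed
           value, its string value, or something else? -->
      <xsl:sort select="."/>
      <xsl:copy-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

In XSLT 1.0 the first case was unambiguous: the result of select was
converted as if by string(), which takes the first node in document
order, so the first item would sort on 12.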


Another point here is that I think that banning the data-type
attribute from taking QNames other than those naming the primitive
types of XML Schema is a source of backwards incompatibility, but it
isn't listed as such in Appendix J.1.1.

In XSLT 1.0, the data-type attribute could contain any QName that is
not an NCName (that is, any prefixed name), which enabled people to
write custom collating sequences (e.g. with Saxon) and point to the
sequence through the data-type attribute. As the draft currently
stands, that will no longer be permitted.

My understanding is that people should use the collation attribute to
point to custom collating sequences instead, so I don't think it's a
problem, just something that should be listed in Appendix J.1.1.
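
For illustration, the migration might look something like the sketch
below. Both the my:custom-order name and the collation URI are
hypothetical; how either is resolved is processor-specific, and this
assumes that the collation attribute takes a URI:

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:my="http://example.com/hypothetical/sorting">

  <!-- XSLT 1.0 style: a prefixed QName in data-type, interpreted
       by the processor (hypothetical name) -->
  <xsl:template name="old-style">
    <xsl:for-each select="name">
      <xsl:sort select="." data-type="my:custom-order"/>
      <xsl:copy-of select="."/>
    </xsl:for-each>
  </xsl:template>

  <!-- XSLT 2.0 draft style: the collation attribute instead
       (hypothetical collation URI) -->
  <xsl:template name="new-style">
    <xsl:for-each select="name">
      <xsl:sort select="."
                collation="http://example.com/hypothetical/collation"/>
      <xsl:copy-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```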

(Having said that, it would be nice if the data-type attribute could
name derived simple types as well as the primitive ones. Perhaps
that's disallowed so that processors that aren't otherwise
schema-aware can still sort by date, for example?)

Cheers,

Jeni
---
Jeni Tennison
http://www.jenitennison.com/

Received on Friday, 18 January 2002 10:16:55 UTC