[Bug 5183] [FO] Effect of type promotion in fn:distinct-values

http://www.w3.org/Bugs/Public/show_bug.cgi?id=5183

------- Comment #4 from mike@saxonica.com  2007-10-24 08:13 -------
I would like to propose a solution along the following lines (detailed wording
to follow later).

1. The function partitions the items in the atomized input into a number of
groups such that:

  a. Within a group, every pair of items is mutually equal (that is, A eq B,
or A and B are both NaN)

  b. Given two distinct groups, there is at least one pair of values, one
chosen from each group, such that the two values are unequal (A ne B unless
one is NaN)

  c. Note that this does not guarantee that there is no pair of values that are
equal to each other but assigned to different groups, because "eq" is not
transitive when type promotion is involved

  d. Note also that in the general case there may be more than one possible
partitioning that meets these rules.

2. The function then selects one item from each group, chosen arbitrarily,
except [discuss?] that the item that is chosen from one group must not be equal
to the item that is chosen from any other group.
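
To make rules 1a, 1b and 2 concrete, here is a rough Python sketch that checks
a candidate partitioning against them. Everything in it is an assumption made
for illustration: items are modelled as (type, value) pairs, eq() is a
simplified stand-in for XPath "eq" with decimal -> float -> double promotion,
and the names f32, promote, valid_partition and valid_selection are
hypothetical. It is not the spec's definition of equality.

```python
import math
import struct
from decimal import Decimal
from itertools import combinations, product

RANK = {"decimal": 0, "float": 1, "double": 2}  # assumed promotion order

def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def promote(item, target):
    """Return the item's value converted to the target type's value space."""
    _, value = item
    if target == "decimal":
        return value  # only reached when both operands are decimals
    as_double = float(value)
    return f32(as_double) if target == "float" else as_double

def eq(a, b):
    """Simplified model of 'eq' after promotion; NaN is treated as equal to NaN."""
    target = max(a[0], b[0], key=RANK.get)
    x, y = promote(a, target), promote(b, target)
    if target != "decimal" and math.isnan(x) and math.isnan(y):
        return True
    return x == y

def valid_partition(groups):
    """Rule 1a: all pairs within a group are equal.
    Rule 1b: any two groups contain at least one unequal cross pair."""
    rule_a = all(eq(x, y) for g in groups for x, y in combinations(g, 2))
    rule_b = all(any(not eq(x, y) for x, y in product(g, h))
                 for g, h in combinations(groups, 2))
    return rule_a and rule_b

def valid_selection(chosen):
    """Rule 2: the items chosen from different groups are pairwise unequal."""
    return all(not eq(x, y) for x, y in combinations(chosen, 2))

F = ("float", f32(0.1))
D = ("decimal", Decimal("0.1"))
DBL = ("double", 0.1)

print(eq(F, D), eq(D, DBL), eq(F, DBL))   # equality is not transitive here
print(valid_partition([[F, D], [DBL]]))   # one acceptable partitioning
print(valid_partition([[F], [D, DBL]]))   # a second acceptable partitioning
print(valid_selection([F, DBL]), valid_selection([D, DBL]))
```

Under this model the decimal 0.1 is equal to both the float and the double,
while the float and the double are not equal to each other. That is why two
different partitionings pass (point 1d), and why the bracketed restriction in
step 2 would rule out picking the decimal.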

I think this can be implemented by an algorithm that processes the items in
input order and decides immediately, for each item, whether to include it in
the result. It would retain in memory (a) the items that have been returned in
the output, and (b) for each value that has been returned in the output, at
most one value of each primitive data type that has not itself been returned
but is equal to a value that has been returned.

For xsl:for-each-group, given that we guarantee the order of groups and the
order of items within a group, we could be a bit more prescriptive: we could
prescribe an algorithm that processes the items in order and allocates each one
to an existing group if it is equal to every item in that group, and that
starts a new group otherwise. This algorithm (I believe!) meets the rules for
distinct values given above, and gives a more predictable result.
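
A minimal Python sketch of that grouping rule, under stated assumptions: the
name group_in_order is hypothetical, the equality test is passed in as a
parameter, and where an item is equal to every member of more than one group
the sketch picks the first such group (the proposal does not say which). The
"close" function is only a toy stand-in for a non-transitive "eq", not XPath
semantics.

```python
def group_in_order(items, eq):
    """Assign each item, in input order, to the first existing group whose
    members are all equal to it; otherwise start a new group."""
    groups = []
    for item in items:
        for group in groups:
            if all(eq(item, member) for member in group):
                group.append(item)
                break
        else:
            # No existing group accepted the item.
            groups.append([item])
    return groups

def close(a, b):
    """Toy non-transitive equality: 1 'eq' 2 and 2 'eq' 3, but 1 'ne' 3."""
    return abs(a - b) <= 1

print(group_in_order([1, 2, 3, 2, 4], close))
# prints [[1, 2, 2], [3, 4]]
```

In this run 2 and 3 compare equal yet land in different groups, which is the
situation described in point 1c. Group order and the order of items within
each group both follow input order.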

Received on Wednesday, 24 October 2007 08:13:18 UTC