Re: ACTION: EricP to extend < and relational ops to string, get review by Andy

Eric Prud'hommeaux wrote:
> On Mon, Sep 26, 2005 at 05:13:11PM +0100, Seaborne, Andy wrote:
> 
>>
>>
>>Eric Prud'hommeaux wrote:
>>
>>>I've updated the sorting order section to reflect that < is
>>>responsible for all literals:
>>
>>Err - all literals?
>>
>>Surely you just can't compare some combination, say integers and strings?
>>What about literals typed with types the processor does not understand?
>>
>>I thought the step was to add < etc for xsd:strings, which enables the URI 
>>comparisons by deferring to "<" on the URI string.  This seems to be aimed 
>>much more widely than that.
> 
> 
> Hmm, you are right on both counts. The action item has the word
> "string" in it and I attacked all literals. I actually recall
> discussing this with you when I was at MIT (I remember where I was
> pacing) on 30 Aug. Specifically, I recall discussing the relative
> order of numerics vs. opaque datatypes and about datatype entailment
> vs. simple entailment.
> 
> This proposal is intended to simplify the behavior where possible by
> making the '<' operator do all the work for sorting literals. This is
> more aggressive than the SQL spec, which I believe says of the <
> operator "The declared types of the corresponding fields of the two
> <row value predicate>s shall be comparable." Further, it says of
> ordering "the applicable <comp op> is the <less than operator>." In
> short, SQL won't order anything it can't compare, and it won't compare
> strings with integers.


The "order everything" for SPARQL ordering is because:
1/ apps can slice result sets with offset/limit
2/ It avoids errors during sorting (comparing across solutions)

This does not mean that the sort order need be exposed in rq23 in filters, which 
are tests within a single solution.  An application can add a custom 
function if need be to get at that test.
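
To make the distinction concrete, here is a minimal sketch in Python. It is 
not taken from rq23: the (kind, value) term encoding, the names order_key and 
filter_less_than, and the grouping of literals by type name are all 
assumptions made for illustration. Sorting uses a key that places every term 
somewhere, while the filter-style "<" raises a type error for operands it 
cannot compare.

```python
# Hypothetical term encoding: (kind, value) pairs; not from rq23.
KIND_RANK = {"unbound": 0, "bnode": 1, "iri": 2, "literal": 3}


class FilterTypeError(Exception):
    """Raised when '<' is applied to operands it is not defined for."""


def filter_less_than(a, b):
    """'<' as a FILTER test: defined only for two literals of the same known type."""
    (kind_a, val_a), (kind_b, val_b) = a, b
    same_type = type(val_a) is type(val_b) and isinstance(val_a, (int, str))
    if kind_a == kind_b == "literal" and same_type:
        return val_a < val_b
    raise FilterTypeError(f"cannot compare {a!r} with {b!r}")


def order_key(term):
    """Sort key giving every term a position, so ordering never raises."""
    kind, value = term
    if kind == "literal":
        # Group literals by Python type name first (an arbitrary choice),
        # so values of different types are never compared directly.
        return (KIND_RANK[kind], type(value).__name__, value)
    return (KIND_RANK[kind], "", "" if value is None else value)


terms = [("literal", "a"), ("iri", "http://example/x"), ("unbound", None),
         ("literal", 1), ("bnode", "b0")]
ordered = sorted(terms, key=order_key)
```

With this split, sorting the mixed list succeeds and is stable under 
offset/limit, whereas filter_less_than(("literal", 1), ("literal", "a")) 
raises FilterTypeError rather than inventing an answer.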

> 
> 
>>>[[ http://www.w3.org/2001/sw/DataAccess/rq23/#defn_Ordered
>>>RDF Literals are compared with the "<" operator (see the Operator
>>>Mapping Table).
>>>...
>>>  1.  (Lowest) no value assigned to the variable or expression in
>>>  this solution.
>>>  2. Blank nodes
>>>  3. IRIs
>>>  4. RDF literals
>>>]]
>>
>>What I intended was that ordering may force certain comparisons to make the 
>>result sequence more predictable but these tests would be type errors in a 
>>FILTER.  Hence the rules above.
>>
>>"""
>>5. A plain literal before an RDF literal with type xsd:string of  the same 
>>lexical form.
>>"""
>>
>>This would still be needed as they are the same value (we could just not 
>>worry and leave rule 5 out).
> 
> 
> It seems we can make a fairly arbitrary choice about what
> functionality to put in Order vs. the < operator. My decision was
> based on what I thought might be simplest to communicate to the user,
> as well as the presumption that MySQL had some reason to allow one to
> compare everything except NULL (see attached tests to that
> effect). MySQL's ORDER behavior ('a', 1, NOW()) does not reflect the <
> operator (1, NOW(), 'a'), which strikes me as unfortunate. I think we
> can do better by putting the defns all in one place.
> 
> On the other hand, it may be just crazy to compare 1 and 'a'. Will
> folks do it anyways? Do we want to restrict < and create another
> function like ORDER(A, B) that actually reflects (implements) the
> ORDERing functionality?
> 
> Anyways, this should get a little discussion started and we can
> establish people's priorities. /me thinking that maybe he should
> draft string < string (replacing literal < literal) and add an
> ORDER operator.
> 
> 
>>>clarified the selection of operator definitions:
>>>[[
>>>When selecting the operator definition for a given set of parameters,
>>>the definition with the most specific parameters applies. For
>>>instance, when evaluating xs:integer = xs:signedInt, the definition
>>>for = with two numeric parameters applies, rather than the one with
>>>two RDF terms. The table is arranged so that the upper-most viable
>>>candidate is the most specific.
>>>]]
>>>
>>>Added four Operator Table Entries like:
>>> A < B  literal literal sop:literal-less-than(A, B) 
>>> xsd:boolean
>>>
>>>Created sections for sop:literal-{less,greater}-than:
>>>[[
>>>11.2.3.0 sop:literal-less-than
>>>
>>>Returns TRUE if the first argument sorts earlier than the second
>>>argument according to this earliest rule in the sorting rules:
> 
> 
>>This seems circular - the ordering text defers to "<" where possible but 
>>the "<" operator may defer to the sort order (which isn't a total ordering 
>>anyway).
> 
> 
> Ahh, "the sorting rules:" was intended to refer to the list that
> immediately followed it (which you included in your quote), not to the
> list in 10.1 Order.
> If that were clear, would this circularity go away?

I think there should be only one set of ordering rules.

I don't see how these (sop:literal-less-than) rules can be compatible with the 
extensibility of a SPARQL processor to new datatypes.  That seems to require 
that "<" is not defined for things unless the processor positively knows their 
values have that relationship.  Rule 3 uses fn:compare, but fn:compare works on 
strings only, so what happens with typed literals?  Either it is forcing 
things even when the processor knows something about datatype romanNumeral, or 
it only applies to strings.

 Andy

> 
>>>  1. Numerics sort before xs:dateTimes.
>>>  2. xs:dateTimes sort before typed literals that are neither numeric
>>>  nor an xs:dateTime.
>>>  3. The remaining datatypes are compared with fn:compare. If the
>>>  result is -1, the first argument sorts before the second
>>>  argument. If the result is 1, the second argument sorts before the
>>>  first argument. If the result is 0, the order is determined by the
>>>  order of the lexical value of the datatypes. If fn:compare returns
>>>  -1, the first argument sorts before the second argument. If it
>>>  returns 1, the second argument sorts before the first argument. If
>>>  it returns 0, the two arguments are equivalent.
>>>
>>>11.2.3.0.5 sop:literal-greater-than
>>>
>>>Returns FALSE if the first argument sorts earlier than the second
>>>argument according to this earliest rule in the above sorting rules.
>>>]]
>>>
>>>and simplified sop:RDFterm-equal
>>>[[
>>>Returns TRUE if the two arguments are the same RDF term.
>>>]]
>>>
>>>I submit this to Andy's (and anyone else's) review.
>>
>> Andy
>>
> 
> 
> 
> ------------------------------------------------------------------------
> 
> mysql> SELECT IF (1 < 'a', 'T', 'F');
> +------------------------+
> | F                      |
> +------------------------+
> mysql> SELECT IF (1 > 'a', 'T', 'F');
> +------------------------+
> | T                      |
> +------------------------+
> mysql> SELECT IF (1 > NOW(), 'T', 'F');
> +--------------------------+
> | F                        |
> +--------------------------+
> mysql> SELECT IF (1 < NOW(), 'T', 'F');
> +--------------------------+
> | T                        |
> +--------------------------+
> mysql> SELECT IF ('a' > NOW(), 'T', 'F');
> +----------------------------+
> | T                          |
> +----------------------------+
> mysql> SELECT IF ('a' < NOW(), 'T', 'F');
> +----------------------------+
> | F                          |
> +----------------------------+
> 
> mysql> CREATE TABLE t (i INTEGER, c CHAR(1), d DATETIME);
> mysql> INSERT INTO t (i,c,d) VALUES (1, NULL, NULL);
> mysql> INSERT INTO t (i,c,d) VALUES (NULL, 'a', NULL);
> mysql> INSERT INTO t (i,c,d) VALUES (NULL, NULL, NOW());
> mysql> SELECT i,c,d FROM t ORDER BY IF(i IS NULL, IF(c IS NULL, d, c), i);
> +------+------+---------------------+
> | i    | c    | d                   |
> +------+------+---------------------+
> |    1 | NULL | NULL                |
> | NULL | NULL | 2005-09-27 18:59:10 |
> | NULL | a    | NULL                |
> +------+------+---------------------+

Received on Tuesday, 27 September 2005 10:41:05 UTC