Re: ACTION: EricP to extend < and relational ops to string, get review by Andy] from Eric Prud'hommeaux on 2005-09-27 (public-rdf-dawg@w3.org from July to September 2005)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Tue, 27 Sep 2005 08:14:25 -0400
To: "Seaborne, Andy" <andy.seaborne@hp.com>
Cc: public-rdf-dawg@w3.org
Message-ID: <20050927121425.GL30380@w3.org>
On Tue, Sep 27, 2005 at 11:40:46AM +0100, Seaborne, Andy wrote:
> 
> 
> Eric Prud'hommeaux wrote:
> >On Mon, Sep 26, 2005 at 05:13:11PM +0100, Seaborne, Andy wrote:
> >
> >>
> >>
> >>Eric Prud'hommeaux wrote:
> >>
> >>>I've updated the sorting order section to reflect that < is
> >>>responsible for all literals:
> >>
> >>Err - all literals?
> >>
> >>Surely you just can't compare some combination, say integers and strings?
> >>What about literals typed with types the processor does not understand?
> >>
> >>I thought the step was to add < etc for xsd:strings, which enables the 
> >>URI comparisons by deferring to "<" on the URI string.  This seems to be 
> >>aimed much more widely than that.
> >
> >
> >Hmm, you are right on both counts. The action item has the word
> >"string" in it and I attacked all literals. I actually recall
> >discussing this with you when I was at MIT (I remember where I was
> >pacing) on 30 Aug. Specifically, I recall discussing the relative
> >order of numerics vs. opaque datatypes and about datatype entailment
> >vs. simple entailment.
> >
> >This proposal is intended to simplify the behavior where possible by
> >making the '<' operator do all the work for sorting literals. This is
> >more aggressive than the SQL spec, which I believe says of the <
> >operator "The declared types of the corresponding fields of the two
> ><row value predicate>s shall be comparable." Further, it says of
> >ordering "the applicable <comp op> is the <less than operator>." In
> >short, SQL won't order anything it can't compare, and it won't compare
> >strings with integers.
> 
> 
> The "order everything" for SPARQL ordering is because:
> 1/ apps can slice result sets with offset/limit
> 2/ It avoids errors during sorting (comparing across solutions)
> 
> This does not mean that sort order need be exposed in rq23 in filters which 
> are tests within a single solution.  And an application can add a custom 
> function if need be to get at that test.

fair enough. i've switched to just ordering xs:strings
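
To make the split concrete, here is a minimal Python sketch, not text from rq23. It assumes a hypothetical Term class, a less_than() that is defined only for two xsd:string literals (a type error otherwise, as in a FILTER), and an order_key() that still places every value somewhere using the unbound / blank node / IRI / literal ranking from the Order section. The tie-breaking inside each rank (label, IRI string, datatype then lexical form) is my assumption for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Term:
    """Hypothetical RDF term: kind is "bnode", "iri", or "literal"."""
    kind: str
    lexical: str
    datatype: Optional[str] = None


def less_than(a: Term, b: Term) -> bool:
    """FILTER-style '<', defined here only for two xsd:string literals."""
    for t in (a, b):
        if t.kind != "literal" or t.datatype != XSD_STRING:
            raise TypeError("'<' is not defined for these operands")
    # Python compares str values code point by code point.
    return a.lexical < b.lexical


# Ranking from the Order section: unbound < blank nodes < IRIs < literals.
KIND_RANK = {"bnode": 1, "iri": 2, "literal": 3}


def order_key(t: Optional[Term]):
    """Sort key that never fails, so offset/limit slicing is stable."""
    if t is None:  # no value assigned in this solution
        return (0, "", "")
    # Assumption: ties inside a rank are broken by datatype, then lexical form.
    return (KIND_RANK[t.kind], t.datatype or "", t.lexical)
```

With this shape, sorted(values, key=order_key) always succeeds, while less_than() on an integer and a string raises instead of inventing an answer.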

> >
> >
> >>>[[ http://www.w3.org/2001/sw/DataAccess/rq23/#defn_Ordered
> >>>RDF Literals are compared with the "<" operator (see the Operator
> >>>Mapping Table).
> >>>...
> >>> 1.  (Lowest) no value assigned to the variable or expression in
> >>> this solution.
> >>> 2. Blank nodes
> >>> 3. IRIs
> >>> 4. RDF literals
> >>>]]
> >>
> >>What I intended was that ordering may force certain comparisons to make 
> >>the result sequence more predictable but these tests would be type errors 
> >>in a FILTER.  Hence the rules above.
> >>
> >>"""
> >>5. A plain literal before an RDF literal with type xsd:string of  the 
> >>same lexical form.
> >>"""
> >>
> >>This would still be needed as they are the same value (we could just 
> >>not worry and leave rule 5 out).
> >
> >
> >It seems we can make a fairly arbitrary choice about what
> >functionality to put in Order vs. the < operator. My decision was
> >based on what I thought might be simplest to communicate to the user,
> >as well as the presumption that MySQL had some reason to allow one to
> >compare everything except NULL (see attached tests to that
> >effect). MySQL's ORDER behavior ('a', 1, NOW()) does not reflect the <
> >operator (1, NOW(), 'a'), which strikes me as unfortunate. I think we
> >can do better by putting the defns all in one place.
> >
> >On the other hand, it may be just crazy to compare 1 and 'a'. Will
> >folks do it anyways? Do we want to restrict < and create another
> >function like ORDER(A, B) that actually reflects (implements) the
> >ORDERing functionality?
> >
> >Anyways, this should get a little discussion started and we can
> >establish people's priorities. /me thinking that maybe he should
> >draft string < string (replacing literal < literal) and add an
> >ORDER operator.
> >
> >
> >>>clarified the selection of operator definitions:
> >>>[[
> >>>When selecting the operator definition for a given set of parameters,
> >>>the definition with the most specific parameters applies. For
> >>>instance, when evaluating xs:integer = xs:signedInt, the definition
> >>>for = with two numeric parameters applies, rather than the one with
> >>>two RDF terms. The table is arranged so that the upper-most viable
> >>>candidate is the most specific.
> >>>]]
> >>>
> >>>Added four Operator Table Entries like:
> >>>A < B		literal	literal	sop:literal-less-than(A, B)	xsd:boolean
> >>>
> >>>Created sections for sop:literal-{less,greater}-than:
> >>>[[
> >>>11.2.3.0 sop:literal-less-than
> >>>
> >>>Returns TRUE if the first argument sorts earlier than the second
> >>>argument according to this earliest rule in the sorting rules:
> >
> >
> >>This seems circular - the ordering text defers to "<" where possible but 
> >>the "<" operator may defer to the sort order (which isn't a total 
> >>ordering anyway).
> >
> >
> >Ahh, "the sorting rules:" was intended to refer to the list that
> >immediately followed it (which you included in your quote), not to the
> >list in 10.1 Order.
> >If that were clear, would this circularity go away?
> 
> I think there should be only one set of ordering rules.
> 
> I don't see how these (sop:literal-less-than) rules can be compatible with 
> the extensibility of a SPARQL processor to new datatypes.  That seems to 
> require that "<" is not defined for things unless the processor positively 
> knows their values have that relationship.  Rule 3 uses fn:compare; 
> fn:compare works on strings only, so what happens with typed literals?  
> Either it is forcing things even when the processor knows something about 
> datatype romanNumeral, or it only applies to strings.

Moot now, but ... rule 3 was specifically invoked on the lexical value
of the literal and the datatype URI.
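
For the record, a rough Python sketch of how I read the three rules quoted below. Everything named here is hypothetical: literals are (lexical, datatype URI) pairs, fn_compare() stands in for fn:compare with plain code point ordering, the NUMERIC set is an illustrative subset, and the choice to compare datatype URIs first and lexical values second is an assumption about an ambiguous sentence. Pairs that are both numeric or both xs:dateTime are left to the more specific operators and are not handled.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"
NUMERIC = {XSD + n for n in ("integer", "decimal", "float", "double")}
DATETIME = XSD + "dateTime"


def fn_compare(a: str, b: str) -> int:
    """Stand-in for fn:compare: -1, 0 or 1 by code point order."""
    return (a > b) - (a < b)


def rank(datatype: str) -> int:
    """Rules 1 and 2: numerics, then xs:dateTimes, then everything else."""
    if datatype in NUMERIC:
        return 0
    if datatype == DATETIME:
        return 1
    return 2


def literal_less_than(a, b) -> bool:
    """Sketch of the withdrawn sop:literal-less-than for typed literals."""
    (lex_a, dt_a), (lex_b, dt_b) = a, b
    rank_a, rank_b = rank(dt_a), rank(dt_b)
    if rank_a != rank_b:
        return rank_a < rank_b
    if rank_a != 2:
        # Two numerics or two dateTimes belong to the typed operators.
        raise NotImplementedError("handled by a more specific operator")
    # Rule 3 (assumed reading): datatype URI first, lexical value second.
    result = fn_compare(dt_a, dt_b)
    if result == 0:
        result = fn_compare(lex_a, lex_b)
    return result == -1
```

Note that with a made-up datatype such as http://example.org/romanNumeral, ("IX", romanNumeral) sorts before ("V", romanNumeral) under this sketch even though 9 > 5, which is exactly the extensibility objection above.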

> 
> >
> >>> 1. Numerics sort before xs:dateTimes.
> >>> 2. xs:dateTimes sort before typed literals that are neither numeric
> >>> nor an xs:dateTime.
> >>> 3. The remaining datatypes are compared with fn:compare. If the
> >>> result is -1, the first argument sorts before the second
> >>> argument. If the result is 1, the second argument sorts before the
> >>> first argument. If the result is 0, the order is determined by the
> >>> order of the lexical value of the datatypes. If fn:compare returns
> >>> -1, the first argument sorts before the second argument. If it
> >>> returns 1, the second argument sorts before the first argument. If
> >>> it returns 0, the two arguments are equivalent.
> >>>
> >>>11.2.3.0.5 sop:literal-greater-than
> >>>
> >>>Returns FALSE if the first argument sorts earlier than the second
> >>>argument according to this earliest rule in the above sorting rules.
> >>>]]
> >>>
> >>>and simplified sop:RDFterm-equal
> >>>[[
> >>>Returns TRUE if the two arguments are the same RDF term.
> >>>]]
> >>>
> >>>I submit this to Andy's (and anyone else's) review.
> >>
> >>	Andy
> >>
> >
> >
> >
> >------------------------------------------------------------------------
> >
> >mysql> SELECT IF (1 < 'a', 'T', 'F');
> >+------------------------+
> >| F                      |
> >+------------------------+
> >mysql> SELECT IF (1 > 'a', 'T', 'F');
> >+------------------------+
> >| T                      |
> >+------------------------+
> >mysql> SELECT IF (1 > NOW(), 'T', 'F');
> >+--------------------------+
> >| F                        |
> >+--------------------------+
> >mysql> SELECT IF (1 < NOW(), 'T', 'F');
> >+--------------------------+
> >| T                        |
> >+--------------------------+
> >mysql> SELECT IF ('a' > NOW(), 'T', 'F');
> >+----------------------------+
> >| T                          |
> >+----------------------------+
> >mysql> SELECT IF ('a' < NOW(), 'T', 'F');
> >+----------------------------+
> >| F                          |
> >+----------------------------+
> >
> >mysql> CREATE TABLE t (i INTEGER, c CHAR(1), d DATETIME);
> >mysql> INSERT INTO t (i,c,d) VALUES (1, NULL, NULL);
> >mysql> INSERT INTO t (i,c,d) VALUES (NULL, 'a', NULL);
> >mysql> INSERT INTO t (i,c,d) VALUES (NULL, NULL, NOW());
> >mysql> SELECT i,c,d FROM t ORDER BY IF(i IS NULL, IF(c IS NULL, d, c), i);
> >+------+------+---------------------+
> >| i    | c    | d                   |
> >+------+------+---------------------+
> >|    1 | NULL | NULL                |
> >| NULL | NULL | 2005-09-27 18:59:10 |
> >| NULL | a    | NULL                |
> >+------+------+---------------------+

-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +81.90.6533.3882

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Tuesday, 27 September 2005 12:14:34 UTC