
Re: my action item

From: Pat Hayes <phayes@ihmc.us>
Date: Fri, 4 Aug 2006 08:47:39 -0700
Message-Id: <p06230906c0f91c56265d@[192.168.1.6]>
To: andy.seaborne@hp.com
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>, Eric Prud'hommeaux <eric@w3.org>

>How about a scheme like this for comparison of literals:

That all reads fine to me. Nice summary of what is needed.

Pat

>
>1/ Be explicit about value spaces; the design is comparison by-value.
>
>All operators return true if the implementation positively knows 
>that the two values compare as needed, return false if the 
>implementation positively knows that the two values do not compare as 
>needed, and return error if it does not know.
>
>http://www.w3.org/TR/xmlschema-2/#value-space
>
>2/ Define sop:value-compare(A, B) to be -1, 0, 1 or error depending 
>on whether A is less than, equal to, or greater than B, or it's an 
>unknown comparison.
>
>Note that sop:value-compare can be partial.  A processor always 
>knows A = B without much else if the lexical forms and datatypes 
>match.
>
>3/ Define =, !=, <, <=, >, >= to be the relevant result(s) of value-compare
>
>4/ State which datatypes are required for a SPARQL engine (this 
>could even be less than the current set; xsd:int but not arbitrary 
>length integers;  no decimals, or no dateTime which are a bit larger 
>in implementation costs).
>
>5/ Show that value-compare maps to the "XPath Tests" table for the 
>operators where an implementation provides them.
>
>6/ = and != can be defined on non-literals by RDFterm-equal, as currently.
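
Points 1/ to 3/ above can be sketched as follows. This is a minimal illustration, not the working group's definition: the `Literal` class, the `PARSERS` registry, and the `CompareError` exception are hypothetical names, and Python's `int` and `Decimal` only approximate the XSD value spaces. The six operators are all read off one partial `value_compare`, and "does not know" surfaces as an error rather than as true or false.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from typing import Callable, Dict

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Literal:
    lexical: str   # lexical form, e.g. "2"
    datatype: str  # datatype URI


class CompareError(Exception):
    """The processor does not know how the two values compare."""


# Hypothetical registry of supported datatypes: URI -> lexical-to-value map.
# Adding an entry is how a processor "learns" a datatype.
PARSERS: Dict[str, Callable[[str], object]] = {
    XSD + "integer": int,
    XSD + "decimal": Decimal,
}


def value_compare(a: Literal, b: Literal) -> int:
    """Return -1, 0 or 1, or raise CompareError for an unknown comparison."""
    # Same lexical form and same datatype: known equal without parsing.
    if a == b:
        return 0
    parse_a = PARSERS.get(a.datatype)
    parse_b = PARSERS.get(b.datatype)
    if parse_a is None or parse_b is None:
        raise CompareError("unsupported datatype")
    try:
        va, vb = parse_a(a.lexical), parse_b(b.lexical)
        return (va > vb) - (va < vb)
    except (ValueError, InvalidOperation) as exc:
        raise CompareError("ill-formed lexical form") from exc


# Each operator is the relevant result of value_compare; errors propagate.
def eq(a: Literal, b: Literal) -> bool: return value_compare(a, b) == 0
def ne(a: Literal, b: Literal) -> bool: return value_compare(a, b) != 0
def lt(a: Literal, b: Literal) -> bool: return value_compare(a, b) < 0
def le(a: Literal, b: Literal) -> bool: return value_compare(a, b) <= 0
def gt(a: Literal, b: Literal) -> bool: return value_compare(a, b) > 0
def ge(a: Literal, b: Literal) -> bool: return value_compare(a, b) >= 0
```

In this sketch a FILTER would treat `CompareError` as "no answer", so neither `=` nor `!=` succeeds on a pair the processor cannot evaluate.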
>
>In terms of text change and test change and implementation impact, 
>this is actually quite a small change because it exactly agrees on 
>the fixed set of datatypes we already have.  It just permits 
>extensibility through the principle of value testing.
>
>An implementation can provide more datatypes as it chooses, meeting 
>the "Extensible Value Testing".  It is explicitly monotonic in the 
>capabilities of the processor.  But now legacy or other standards 
>for datatypes can be added smoothly (e.g. ISO 8601 date and time 
>which is not exactly the same as XSD dateTime).
>
>	Andy
>
>
>
>Pat Hayes wrote:
>>>On Tue, Aug 01, 2006 at 11:19:45AM -0700, Pat Hayes wrote:
>>>>  Re. my action item from today's telecon.
>>>>
>>>>  After looking at Andy's examples in
>>>>  http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0104.html
>>>>  more closely, his example 6 seems to behave correctly for the issue
>>>>  that you were raising, if I understand it properly. In which case no
>>>>  further examples are needed, and my action item is moot.
>>>>
>>>>  So let me see if I have got this right.
>>>>
>>>>  My understanding of your concern was that we had a nonmonotonic
>>>>  situation because a not-equal ( !=) filter, as in example 6, behaved
>>>>  as follows: when faced with an unknown datatype, it would revert to a
>>>>  string-not-equal test on the literal string, and so succeed when the
>>>>  literal strings were distinct but the type URI matches; and then this
>>>>  success might transform to a failure when better datatyping
>>>>  information is available.
>>>Our measure of monotonicity is that adding knowledge to the system
>>>does not cause us to rescind conclusions. We should never get answers
>>>from the naive implementation that we don't get from the omniscient
>>>one (adding support for a datatype should not cause us to rescind
>>>answers).
>>
>>Agreed.
>>
>>>  The current text in rq2{3,4} has:
>>>
>>>[[
>>>When selecting the operator definition for a given set of parameters,
>>>the definition with the most specific parameters applies. For
>>>instance, when evaluating xsd:integer = xsd:signedInt, the definition
>>>for = with two numeric parameters applies, rather than the one with
>>>two RDF terms. The table is arranged so that upper-most viable
>>>candidate is the most specific.
>>>...
>>>A != B	numeric	      numeric	    fn:not(op:numeric-equal(A, B))
>>>A != B	xsd:boolean   xsd:boolean   fn:not(op:boolean-equal(A, B))
>>>A != B	xsd:dateTime  xsd:dateTime  fn:not(op:dateTime-equal(A, B))
>>>...
>>>A != B	RDF term      RDF term	    fn:not(RDFterm-equal(A, B))
>>>
>>>The naive implementation sees
>>>   "2"^^xsd:integer != "II"^^roman:numeral
>>>and says "are they both numerics? no, boolean? no ... RDF terms? yes"
>>>and does the RDFterm-equal test. They are not the same term so the
>>>answer is TRUE (remember, *not* equal).
>>
>>OK, I agree this is broken as written, but then this also seems to 
>>be at odds with test 6 in that test suite. So I guess my point is, 
>>regardless of what the spec currently says, those tests illustrate 
>>what the right behavior OUGHT to be, which would be that a != 
>>between two literals with unknown datatypes is simply unknown, and 
>>can never succeed, regardless of the RDF term equality result 
>>between them. So, reverting now to my very limited action item, I 
>>don't need to tweak those tests or add to them in order to show 
>>what the result SHOULD be. Right?
>>
>>>Some wise-guy adds support for roman:numeral to make the omniscient
>>>implementation from the following schema (note: restriction of decimal):
>>>
>>>   <xs:simpleType name="numeral" id="numeral">
>>>     <xs:restriction base="xs:decimal">
>>>       <xs:fractionDigits fixed="true" value="0" id="romanNumeral.fractionDigits"/>
>>>       <xs:pattern value="[IVDXLC]+"/>
>>>       <xs:minInclusive value="0" id="romanNumeral.minInclusive"/>
>>>     </xs:restriction>
>>>   </xs:simpleType>
>>>
>>>Now the implementation says "are they both decimals? yep" and returns
>>>FALSE (II is *not* != 2), causing us to lose an answer that we had in
>>>the naive implementation.
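
The lost answer can be reproduced with a small sketch of the first-viable-row rule. Everything named here is an assumption made for illustration: the `roman:numeral` URI, the `roman_to_int` helper, and the reduction of the operator table to two rows (numeric, then RDF term).

```python
from typing import Callable, Dict, Tuple

Literal = Tuple[str, str]  # (lexical form, datatype URI)

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
ROMAN = "http://example.org/roman#numeral"  # hypothetical URI for roman:numeral


def roman_to_int(s: str) -> int:
    """Hypothetical lexical-to-value mapping for roman:numeral."""
    values = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500}
    total = 0
    # Pair each symbol with its successor; a smaller symbol before a larger
    # one is subtracted (IV = 4), otherwise it is added.
    for ch, nxt in zip(s, s[1:] + " "):
        v = values[ch]
        total += -v if values.get(nxt, 0) > v else v
    return total


def not_equal_by_table(a: Literal, b: Literal,
                       numeric: Dict[str, Callable[[str], int]]) -> bool:
    """A != B using the upper-most viable row of a two-row operator table."""
    if a[1] in numeric and b[1] in numeric:
        # Row "numeric, numeric": fn:not(op:numeric-equal(A, B))
        return numeric[a[1]](a[0]) != numeric[b[1]](b[0])
    # Row "RDF term, RDF term": fn:not(RDFterm-equal(A, B))
    return a != b


two = ("2", XSD_INTEGER)
ii = ("II", ROMAN)

naive = {XSD_INTEGER: int}                          # no roman:numeral support
omniscient = {XSD_INTEGER: int, ROMAN: roman_to_int}

naive_answer = not_equal_by_table(two, ii, naive)            # True
omniscient_answer = not_equal_by_table(two, ii, omniscient)  # False
```

The naive processor answers TRUE and the better-informed one answers FALSE, so adding datatype support rescinds a conclusion.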
>>>
>>>>  But this is not what the test examples indicate. With this rule, in
>>>>  case #6, it would give the answer binding [ x/x1, v/"b"^^t:type1 ],
>>>>  but in fact it does not: it gives no answers, as it should in order
>>>>  to be monotonic when more datatype information is available. And the
>>>>  comment on test 6 seems to indicate that 'no result' is determined
>>>>  in this case for reasons of preserving monotonicity, and works
>>>>  symmetrically for equality and not-equality.
>>>I believe that this test does illustrate the problem. I can concoct a
>>>type system where the two are, in cleverer systems, known to be the
>>>same value.
>>
>>Right, and in that case - following now the behavior indicated by 
>>the example, not by the spec text you cite - the behavior will be 
>>indistinguishable from what it is now (no answers) but if you 
>>instead concoct a system in which they have different values, then 
>>the query will succeed. So either way, we get monotonic behavior. 
>>Again, note I am not following the first-line-in-table rule here, 
>>but the behavior as specified in the test suite email: they give 
>>different results on test 6.
>>
>>So, if we follow the rule as illustrated by test 6, which as I read 
>>the test is that when either of A or B is typed with an unknown 
>>datatype, then  A != B test always fails while A=B succeeds only 
>>when A and B are the exact same literal string and same datatype 
>>URI, then we don't need to do anything about extending the 
>>equality. Right?
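
The rule as read from test 6 can be checked for monotonicity with a short sketch. The URI for `t:type1` and the two value mappings (`str.upper`, `len`) are purely hypothetical stand-ins for "a type system in which the values differ" and "one in which they coincide"; the single row is the binding from case #6.

```python
from typing import Callable, Dict, List, Tuple

Literal = Tuple[str, str]  # (lexical form, datatype URI)
Known = Dict[str, Callable[[str], object]]

T1 = "http://example.org/types#type1"  # hypothetical URI for t:type1


def ne_known(a: Literal, b: Literal, known: Known) -> bool:
    """True only when A != B is positively known; unknown never succeeds."""
    if a[1] not in known or b[1] not in known:
        return False
    return known[a[1]](a[0]) != known[b[1]](b[0])


def eq_known(a: Literal, b: Literal, known: Known) -> bool:
    """True for the identical literal, or when value equality is known."""
    if a == b:
        return True
    if a[1] not in known or b[1] not in known:
        return False
    return known[a[1]](a[0]) == known[b[1]](b[0])


def filter_ne(rows: List[Tuple[str, Literal]], target: Literal,
              known: Known) -> List[str]:
    """Bindings of ?x whose ?v passes FILTER (?v != target)."""
    return [x for x, v in rows if ne_known(v, target, known)]


rows = [("x1", ("b", T1))]
target = ("a", T1)

no_support = filter_ne(rows, target, {})                    # type unknown
distinct_values = filter_ne(rows, target, {T1: str.upper})  # "A" vs "B"
same_value = filter_ne(rows, target, {T1: len})             # 1 vs 1
```

With no support the filter yields no answers; learning the datatype can only add the `x1` answer (values differ) or leave the result empty (values coincide), so nothing is ever withdrawn.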
>>
>>Pat
>>
>>>Therefore, we need to spell it
>>>
>>>SELECT *
>>>{ ?x :p ?v
>>>      FILTER ( ?v !sameLiteral "a"^^t:type1 )
>>>}
>>>
>>>or something like this.
>>>
>>>>  So, either the tests are OK, or I have misunderstood your point.
>>>>
>>>>  Eric? Or indeed, anyone with anything useful to say?
>>>>
>>>>  Pat
>>>>  --
>>>>  ---------------------------------------------------------------------
>>>>  IHMC		(850)434 8903 or (650)494 3973   home
>>>>  40 South Alcaniz St.	(850)202 4416   office
>>>>  Pensacola			(850)202 4440   fax
>>>>  FL 32502			(850)291 0667    cell
>>>>  phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>>>>
>>>--
>>>-eric
>>>
>>>home-office: +1.617.395.1213 (usually 900-2300 CET)
>>>	    +33.1.45.35.62.14
>>>cell:       +33.6.73.84.87.26
>>>
>>>(eric@w3.org)
>>>Feel free to forward this message to any list for any purpose other than
>>>email address distribution.


-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Friday, 4 August 2006 15:49:21 GMT
