Re: adding dawg:monotonicity and extensible data types to SPARQL query

Eric Prud'hommeaux wrote:
> On Mon, Aug 14, 2006 at 01:08:03PM +0200, Eric Prud'hommeaux wrote:
> http://www.w3.org/2001/sw/DataAccess/rq23/rq24#tests v1.14 has a new
> draft of the Value Testing section. This does not include the
> extensible datatypes support (but certainly makes it easier to add).
> This version is intended to include only editorial changes from the CR
> version.
> 
>>    [DONE] ACTION: EricP to respond to PatH's new test with a proof of
>>    whether it's monotonic to extended datatype support [recorded in
>>    [25]http://www.w3.org/2006/08/08-dawg-minutes.html#action01]
> 
>>    <fred> literal = literal: true or error
>>
>>    <fred> iri = iri: true or false
>>
>>    <fred> bnode = bnode: true or false
>>
>>    <fred> allother cells always false
>>
>>    2=3
>>
>>    <AndyS> Yes, Fred - that's the table I was thinking of.
> 
> In 1.14, I've updated RDFterm-equal to the following:
> 
> http://www.w3.org/2001/sw/DataAccess/rq23/rq24#func-RDFterm-equal
> [[
> Returns TRUE if term1 and term2 are the same RDF term as defined in
> Resource Description Framework (RDF): Concepts and Abstract Syntax
> [CONCEPTS]; produces a type error if the arguments are both literal
> but are not the same RDF term;

Isn't this a bit circular in its use of "same RDF term"?  For literals it should 
say something about equality of the three parts (lexical form, datatype and 
lang tag), and similarly for the other kinds of term.

> returns FALSE otherwise. term1 and
> term2 are the same if any of the following is true:
> 
>     * term1 and term2 are equivalent IRIs as defined in 6.4 RDF URI
>       References.
>     * term1 and term2 are equivalent literals as defined in 6.5.1
>       Literal Equality.
>     * term1 and term2 are the same blank node as described in 6.6
>       Blank Nodes.
> ]]
> 
> I added the "; produces a type error if the arguments are both literal
> but are not the same RDF term; returns FALSE otherwise" bit. The rest
> was already there.

Suggestion for a name for this: "unknown-equals" or "general-value-equals", 
with a note that "=" may have been intercepted by a datatype-specific 
definition of "=".

There should be text to give examples; and also for !=.
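
As a rough illustration of the kind of examples I mean, here is a small Python 
sketch of the backstop "=" as worded in the 1.14 draft quoted above.  The class 
and function names (IRI, BNode, Literal, rdfterm_equal, rdfterm_not_equal, 
SparqlTypeError) are made up for this mail, "!=" is assumed to be the plain 
negation of "=", and lang tag case normalisation is ignored.

```python
from dataclasses import dataclass


class SparqlTypeError(Exception):
    """Stands in for a SPARQL type error (hypothetical name)."""


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: str = ""   # empty string: no datatype
    lang: str = ""       # empty string: no lang tag


def rdfterm_equal(t1, t2):
    """True for the same term, error for two differing literals, else False."""
    if t1 == t2:
        return True
    if isinstance(t1, Literal) and isinstance(t2, Literal):
        # Different terms may still denote the same value in a datatype
        # this processor does not understand, so "false" would be unsafe.
        raise SparqlTypeError("literals are not the same RDF term")
    return False


def rdfterm_not_equal(t1, t2):
    """Assumed to be the negation of rdfterm_equal; the error propagates."""
    return not rdfterm_equal(t1, t2)
```

So iri = iri and bnode = bnode give true or false, literal = literal gives true 
or an error, and every mixed pair gives false, which matches Fred's table.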

Let's reserve "term-equals" language for a syntactic test, and not have that 
test generate an error, because "term equality" suggests syntax (to me at 
least) without regard to value.

An operator such as "sameTerm(?x, ?y)" would provide direct access to it.  It's 
shorthand for something like:

( isURI(?x) && isURI(?y) && str(?x) = str(?y) ) ||
( isBlank(?x) && isBlank(?y) && ... same labels .... ) ||
( isLiteral(?x) && isLiteral(?y) &&
   str(?x) = str(?y) &&
    (
      ( lang(?x) = "" && lang(?y) = "" &&            # Same datatype, if any
         ( datatype(?x) = datatype(?y) || true ) )
    ||
      ( lang(?x) = lang(?y) )                        # Same lang, if any
    )
)

The literal part is complex (and probably not correct in the above) because of 
lang tags and datatypes (and it's asymmetric in the treatment of no lang tag 
and no datatype).

There is no way to get the label of a bNode (which is OK).
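
For comparison, a purely syntactic test is easy to state outside SPARQL.  The 
Python sketch below is only an illustration: Term and same_term are 
hypothetical names, and a missing datatype or lang tag is modelled as None, 
which is an assumption that sidesteps the asymmetry in the expression above.

```python
from typing import NamedTuple, Optional


class Term(NamedTuple):
    kind: str                       # "iri", "bnode" or "literal"
    text: str                       # IRI string, bNode label or lexical form
    datatype: Optional[str] = None  # literals only; None means no datatype
    lang: Optional[str] = None      # literals only; None means no lang tag


def same_term(x: Term, y: Term) -> bool:
    """Syntactic comparison: always True or False, never an error."""
    if x.kind != y.kind:
        return False
    if x.kind != "literal":
        return x.text == y.text
    # Literals: all three parts must match exactly.
    return (x.text == y.text
            and x.datatype == y.datatype
            and x.lang == y.lang)
```

Note that "1"^^xsd:integer and "01"^^xsd:integer are different terms here even 
though they have the same value, which is the point of keeping the syntactic 
test separate from "=".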

I assume datatype("eric"@fr) is an error - I can't find anything about it in rq24.

>>    <AndyS> bNode = literal (not bNode in query) may be valid
>>
>>    <AndyS> Separate sameLiteral operator.
>>
>>    <AndyS> if we want a syntactic comparison
>>
>>    <AndyS> "(x,y)"^^:geo
>>
>>    <AndyS> If you want help with this, do ask - I'm the one keen to have
>>    this extensibility so I feel responsible here.
>>
>>    <kendallclark> ACTION: EricP to redraft section 11 to support
>>    extensible datatypes [recorded in
>>    [18]http://www.w3.org/2006/08/08-dawg-minutes.html#action08]
> 
> To this end, I propose the following addendum to the derived types list:
> [[
> Extended SPARQL implementations may treat additional types as being
> derived from numeric types.
> ]]

There is no need to restrict things to numerics.  Any new value space is 
possible.  Examples:

1/ xsd:dates
2/ Things with units.
    For a sufficiently knowledgeable processor:
    "273"^^:kelvin should not compare with "273"^^xsd:integer [*]
    "273"^^:kelvin should compare with "+273"^^:kelvin
    "275"^^:kelvin should compare with "2"^^:centigrade

[*] Let's not confuse recording a temperature as a number with recording it 
as a unit datatype.  :kelvin(273) would be needed.
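
A minimal Python sketch of what "sufficiently knowledgeable" could mean for 
the unit case.  Everything here is an assumption for illustration: the 
DATATYPES table, the value space names and value_equal are invented, and 
temperatures are mapped to kelvin using the usual 273.15 offset.

```python
from decimal import Decimal


class SparqlTypeError(Exception):
    """Stands in for a SPARQL type error (hypothetical name)."""


# Hypothetical datatypes: name -> (value space, lexical-to-value mapping).
# Both temperature datatypes map into kelvin so they share one value space.
DATATYPES = {
    ":kelvin": ("temperature", lambda lex: Decimal(lex)),
    ":centigrade": ("temperature", lambda lex: Decimal(lex) + Decimal("273.15")),
    "xsd:integer": ("number", lambda lex: Decimal(int(lex))),
}


def value_equal(lex1, dt1, lex2, dt2):
    """Compare two typed literals by value, within one value space only."""
    if dt1 not in DATATYPES or dt2 not in DATATYPES:
        raise SparqlTypeError("unknown datatype")
    space1, to_value1 = DATATYPES[dt1]
    space2, to_value2 = DATATYPES[dt2]
    if space1 != space2:
        raise SparqlTypeError("different value spaces")
    return to_value1(lex1) == to_value2(lex2)
```

With this, "273"^^:kelvin against "273"^^xsd:integer is an error, while 
"273"^^:kelvin against "+273"^^:kelvin and "275"^^:kelvin against 
"2"^^:centigrade both give a definite true or false.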

> 
> and a new minor section following the operator table:
> [[
> 11.3.1 Operator Extensibility
> 
> Extended SPARQL implementations may support additional associations
> between operators and operator functions; this amounts to adding rows
> to the table above. No additional operator support may yield a result
> that replaces any result other than a type error in an unextended
> implementation. The consequence of this rule is that extended SPARQL
> implementations will produce at least the same solutions as an
> unextended implementation, and may, for some queries, produce more
> solutions.
> ]]

The text "and may, for some queries, produce more solutions" won't be true 
because we have logical not.

> 
> I think this behaves exactly as sop:value-compare would.
> 
> 
> Cost:
> 
> Is the cost of using the same operator for value comparison and symbol
> comparison less than that of making users use a different operator for
> RDFterm-equal? I think it's a matter of taste. The weird case in this
> solution is that you can't negate a syntactic literal equivalence
> test.

This isn't symbol comparison any more because the backstop "=" does not work 
on all symbol combinations (unknown datatypes, different lexical forms).

> 
> Data:
>   <x> <p> "II"^^roman:numeral .
> 
> Query1:
>   ASK { ?x ?p ?v
>         FILTER (?v = "IV"^^roman:numeral) }
> Result1: no
> 
> Query2:
>   ASK { ?x ?p ?v
>         FILTER (?v != "IV"^^roman:numeral) }
> Result2: no
> 
> Of course, an extended SPARQL implementation may give you a yes for
> the latter but the issue that will make users cock their heads shows
> up in the unextended implementation.

That's inevitable with monotonicity + extensible datatypes + ASK masking error 
vs false.  And that's OK.
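
To spell out why both ASK queries say "no" in the unextended case, here is a 
small Python sketch.  It is a model only: ask, equal and roman_to_int are 
hypothetical helpers, literals are (lexical form, datatype) pairs, and the 
"extended" flag stands for a processor that understands roman:numeral.

```python
class SparqlTypeError(Exception):
    """Stands in for a SPARQL type error (hypothetical name)."""


def roman_to_int(s):
    values = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
    total = 0
    # A symbol smaller than its successor is subtracted (IV = 4).
    for ch, nxt in zip(s, s[1:] + " "):
        v = values[ch]
        total += -v if values.get(nxt, 0) > v else v
    return total


def equal(a, b, extended):
    """Backstop "=" on literals, optionally extended with roman:numeral."""
    if a == b:
        return True
    if extended and a[1] == b[1] == "roman:numeral":
        return roman_to_int(a[0]) == roman_to_int(b[0])
    raise SparqlTypeError("literals differ and the datatype is unknown")


def ask(values, test):
    """ASK with a FILTER: an error drops the solution exactly as false does."""
    for v in values:
        try:
            if test(v):
                return True
        except SparqlTypeError:
            continue
    return False


data = [("II", "roman:numeral")]
target = ("IV", "roman:numeral")
```

Unextended, both ask(data, lambda v: equal(v, target, False)) and the negated 
form come out False because the error is masked.  Extended, the "=" query 
stays False and only the "!=" query becomes True, so the extension only ever 
replaces an error.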

> 

I still think a paragraph that talks explicitly about value spaces will make 
it clearer.  Then say that "=" etc. work on pairs of values from the same 
value space.

If you want, I'll write this text.

	Andy

Received on Monday, 21 August 2006 14:08:12 UTC