Re: Datatyping Summary from Patrick Stickler on 2002-01-29 (w3c-rdfcore-wg@w3.org from January 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Tue, 29 Jan 2002 11:19:50 +0200
To: Dan Connolly <connolly@w3.org>, Brian McBride <bwm@hplb.hpl.hp.com>
CC: RDF Core <w3c-rdfcore-wg@w3.org>, Jeremy Carroll <jjc@hplb.hpl.hp.com>, Sergey Melnik <melnik@db.stanford.edu>
Message-ID: <B87C35D6.C8DE%patrick.stickler@nokia.com>

On 2002-01-29 0:17, "ext Dan Connolly" <connolly@w3.org> wrote:

>> This is similar to B2.  I've changed the example slightly from Sergey's.
>> Consider the graph:
>> 
>>    _:f <rdf:type> <film> .
>>    _:f <dc:Title> "10" .
>>    <mary> <age> "10" .
>> 
>> Given a query:
>> 
>>    (?x <dc:Title> ?y) & (?z <age> ?y)
>> 
>> existing applications will return:
>> 
>>    ?x = _:f, ?y = "10", ?z = <mary>
>> 
>> Under TDL, they would return null.

(apologies in advance for the lengthy reply, please
 bear with me... I've made it as terse as I dare)

I believe that Jeremy's recent 1984 example (in
addition to other examples provided over the past
few days) clearly demonstrates that a literal
does not have consistent global meaning. Whether
the meaning of a literal is expressed in RDF or
externally by application is beside the point.

A literal *can* mean different things in different
contexts.

Thus, a query that is based solely on a literal
value, with disregard for its datatype context(s)
is a query only on the syntax of the graph, not
on the knowledge expressed by the graph.

Given that, it depends on whether the implementation is
doing string comparison of literals or node comparison,
whether a TDL implementation would return null or not.

If the query engine is simply doing string comparison
of the literal labels (which I think is a reasonable
expectation) then it should return

     ?x = _:f, ?y = "10", ?z = <mary>

This doesn't mean that the literal "10" assigned to
?y denotes the same "thing", only that it is the
same literal, the string equal label of two nodes
in the RDF graph, i.e there are two statements which
have an intersection of the same literal label, that's
all.

What thing it denotes depends on its interpretation,
and that interpretation depends on a datatype context.

To use a query over graph syntax reliably, you must
differentiate between the different interpretations
of the literals by context.

One way to do that is to view a logical layer above
the graph which associates the datatype context with
each literal, i.e. it provides the TDL pairings; e.g.

     _:f <rdf:type> <film> .
     _:f <dc:Title> ("10", xsd:string) .
     <mary> <age> ("10", xsd:integer) .

Now, given a query:

    (?x <dc:Title> ?y) & (?z <age> ?y)

every application should return null because the value
denoted by ("10", xsd:string) is not the same value
denoted by ("10", xsd:integer).

Thus in conclusion (and you thought I would never get
there ;-) an application which equates the two occurrences
of "10" as the same value and successfully returns a
result for the specifiec query is in fact broken, as it
misses the point that two different values are denoted.

Patrick

--

Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com

Received on Tuesday, 29 January 2002 04:19:04 UTC