Re: Interpretation of RDF reification

Dan Brickley wrote:
> Let's ask it if the resource <registrar-1.rdf> is the dc:source of an
> rdf:Statement
> that has a predicate 'wife', subject
> <tag:danbri.org:2006:people:charlie> and
> object <tag:danbri.org:2006:people:alice>:

To recap, the moral of Dan's story is that RDF-reification doesn't track
the actual URI someone used in a document, and this leads to
non-intuitive semantics in this case:

_:s1 rdf:type rdf:Statement .
_:s1 rdf:subject ex:bob .
_:s1 dc:source <registrar-1.rdf> .

ex:bob owl:sameAs ex:charlie .

ASK: { ?s rdf:subject ex:charlie  .
       ?s dc:source <registrar-1.rdf> . }
--> YES
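
For anyone who wants to poke at this without an OWL reasoner, here is a
minimal sketch in Python.  It assumes triples are plain tuples of strings
and that owl:sameAs means nothing more than "substitute one name for the
other wherever it occurs"; the helper names same_as_closure and ask are
made up for illustration and do not come from any real toolkit.

```python
SAME_AS = "owl:sameAs"

def same_as_closure(triples):
    """Add substituted copies of triples until owl:sameAs yields nothing new."""
    inferred = set(triples)
    while True:
        pairs = {(s, o) for s, p, o in inferred if p == SAME_AS}
        pairs |= {(o, s) for s, o in pairs}  # owl:sameAs is symmetric
        new = set()
        for a, b in pairs:
            for s, p, o in inferred:
                if s == a:
                    new.add((b, p, o))
                if o == a:
                    new.add((s, p, b))
        if new <= inferred:  # fixed point reached
            return inferred
        inferred |= new

def ask(triples, patterns):
    """True if some ?s matches every (predicate, object) pattern."""
    subjects = {s for s, _, _ in triples}
    return any(all((s, p, o) in triples for p, o in patterns) for s in subjects)

data = {
    ("_:s1", "rdf:type", "rdf:Statement"),
    ("_:s1", "rdf:subject", "ex:bob"),
    ("_:s1", "dc:source", "<registrar-1.rdf>"),
    ("ex:bob", "owl:sameAs", "ex:charlie"),
}
query = [("rdf:subject", "ex:charlie"), ("dc:source", "<registrar-1.rdf>")]

print(ask(data, query))                   # False: plain triple matching
print(ask(same_as_closure(data), query))  # True: after owl:sameAs substitution
```

The only difference between the two answers is whether the owl:sameAs
substitution has been applied before matching.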

Two minor points to add to Dan's.  First, as Dan points out,
RDF-reification isn't appropriate for tracking the actual URIs people
are using in documents.  The reason for this, though, is that in RDF,
URIs aren't things we can refer *to*.  There's no way to assert
something about a particular URI itself, i.e. that *that* URI was used
in some document, unless you create a new vocabulary.  (That makes a lot
of sense, really, because how would you ever know if a URI was there to
mean the denoted entity, or the URI itself?)
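
One way such a new vocabulary might look, sketched in Python under loud
assumptions: ex:subjectUri is a hypothetical property (it is not part of
RDF, OWL or Dublin Core) whose value is a literal holding the URI's
characters, and the one-step substitute helper is likewise invented for
illustration.  Because a literal is a different term from the URI it
spells out, owl:sameAs substitution leaves it alone.

```python
from collections import namedtuple

# A literal is a distinct kind of term: Literal("ex:bob") is not the URI "ex:bob".
Literal = namedtuple("Literal", "value")

def substitute(triples, a, b):
    """One owl:sameAs step: copy every triple that uses URI a, using URI b instead."""
    out = set(triples)
    for s, p, o in triples:
        if s == a:
            out.add((b, p, o))
        if o == a:
            out.add((s, p, b))
    return out

data = {
    ("_:s1", "rdf:subject", "ex:bob"),              # de re: the thing denoted
    ("_:s1", "ex:subjectUri", Literal("ex:bob")),   # hypothetical: the name used
}
merged = substitute(data, "ex:bob", "ex:charlie")

print(("_:s1", "rdf:subject", "ex:charlie") in merged)             # True
print(("_:s1", "ex:subjectUri", Literal("ex:charlie")) in merged)  # False
```

So a query against the hypothetical ex:subjectUri property would still
distinguish the name that was written down from other names for the
same thing.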

The second point is that while the SPARQL query might be unintuitive, we
actually have the same problem in English.  In semantics this is called
the "de re"/"de dicto" distinction.  To roughly translate the SPARQL
query into English we get:

Q:  "Does <registrar-1.rdf> refer to a man named Charlie?"

This question is ambiguous and would be true in both of these situations:

1)  I know someone named Bob Smith.  I have a document <registrar-1.rdf>
which mistakenly thinks Bob's name is Charlie.   About Bob
<registrar-1.rdf> says "Charlie is nice." (de dicto)

2)  I know someone named Charlie Smith.  About him, <registrar-1.rdf>
says "Mr. Smith is nice." (de re)

In the first case, <registrar-1.rdf> refers to a man using the name
Charlie even though that's not his real name.  (But the answer to the
English question above could still be 'yes'.)  This is the 'desired' RDF
interpretation in Dan's use case.

In the second case, <registrar-1.rdf> refers to a man, who is named
Charlie, but without using that name.  (And you could still answer 'yes'
to the English question.)  This is actually what RDF defines, and in
this light a 'yes' answer to the SPARQL query also makes a lot of sense.
So while in English both the de dicto and de re readings are available,
in RDF you only have de re interpretations of URIs.

-- 
- Joshua Tauberer

http://taubz.for.net

"Unfortunately, we're having this discussion. It's too bad,
because guess who listens to the discussion: the enemy."


Dan Brickley wrote:
> * Lars Marius Garshol <larsga@ontopia.net> [2006-03-22 21:37+0100]
>>
>> I've been trying to read the answer to this question out of the RDF
>> specs, and I think I've got it, but would like to make 100% certain.
>>
>> If I create an RDF node that reifies the statement
>>
>>   (winston, married-to, clementine)
>>
>> what does that node represent? Specifically, does it represent the  
>> *statement* that these two are married, or does it represent the  
>> *marriage* relationship between them? That is, if the reifying RDF  
> 
> OK, going back to the start of this thread, and picking up a theme 
> live in various blog posts on planetrdf.com lately, ... I tried 
> working thru an example. Afraid it's not at the stage where I've tested
> it with tools yet, but might be useful. Pasting it in here (below) for now, 
> will blog when the examples are more machine-readable. --Dan
> 
> 
> 
> OK, let me try sketching some test cases around reification, which could
> (sorry, not there yet - help
> welcomed) be plugged into OWL reasoners and SPARQL query engines. Forget
> superman; our scenario is 
> more worldly. We are web detectives, on the trail of a would-be bigamist,
> whose multiple identifiers 
> aren't all familiar to the registrars who have been busy marrying him
> off. Imagine that the registrars
> publish their official records in RDF, and that we're consuming those
> records, alongside some
> other trusted evidence, with an OWL-aware software system. Further
> imagine that we export some
> hopefully-useful RDF from the OWL system and query it using SPARQL, with
> the intent of asking questions
> like "which registrar said what?". A lot of folks try to use RDF
> reification in such scenarios;
> I'm not convinced it works. 
> 
> registrar-1.rdf:
> 
>  <tag:danbri.org:2006:people:bob> <http://example.org/family#wife>
> <tag:example.org:2005:people:alice> .
> 
> # the resource called <tag:danbri.org:2006:people:bob> has a 'wife' that
> # is the resource 
> # called <tag:example.org:2005:people:alice>
> 
> 
> registrar-2.rdf:
> 
>  <tag:danbri.org:2006:people:charlie> <http://example.org/family#wife>
> <tag:example.org:2005:people:mary> .
> 
> # the resource called <tag:danbri.org:2006:people:charlie> has a 'wife'
> # that is the resource 
> # called <tag:example.org:2005:people:mary>
> 
> 
> nndb-example-bio.rdf:
>  <tag:danbri.org:2006:people:charlie>
> <http://www.w3.org/@something/.../owl#sameAs>
> <tag:danbri.org:2006:people:bob> 
> 
> # <tag:danbri.org:2006:people:charlie> and
> # <tag:danbri.org:2006:people:bob> are URI names for the same resource
> 
> 
> 
> who-said-what.rdf:
> # trying to keep track of these different claims using RDF reification
> # vocab. 
> # (this is the thing I don't think does what people hope it does...)
> 
>  _:s1 rdf:type rdf:Statement .
>  _:s1 rdf:predicate <http://example.org/family#wife> . 
>  _:s1 rdf:subject <tag:danbri.org:2006:people:bob> .
>  _:s1 rdf:object <tag:danbri.org:2006:people:alice> .
>  _:s1 <http://purl.org/dc/elements/1.1/source> <registrar-1.rdf> .
> 
>  _:s2 rdf:type rdf:Statement .
>  _:s2 rdf:predicate <http://example.org/family#wife> . 
>  _:s2 rdf:subject <tag:danbri.org:2006:people:charlie> .
>  _:s2 rdf:object <tag:danbri.org:2006:people:mary> .
>  _:s2 <http://purl.org/dc/elements/1.1/source> <registrar-2.rdf> .
> 
> So, at face value, who-said-what.rdf captures an RDF description 
> of the claims in both registrar-1.rdf and registrar-2.rdf, and
> associates them with simple provenance information - in this case, 
> by identifying a "dc:source" document, associated with 
> some described RDF statement.
> 
> However, what happens if we believe the (perfectly reasonable) 
> document, nndb-example-bio.rdf, which tells us that two 
> URIs denote the same resource? ie. that the thing called  
> <tag:danbri.org:2006:people:charlie> is the owl:sameAs thing 
> as that called <tag:danbri.org:2006:people:bob>.
> 
> My understanding (sorry I can't quote chapter-and-verse here) is that 
> 
>  _:s1 rdf:subject <tag:danbri.org:2006:people:bob> .
> combined with 
>  <tag:danbri.org:2006:people:charlie>
> <http://www.w3.org/@something/.../owl#sameAs>
> <tag:danbri.org:2006:people:bob> 
> gives us an extra triple,
>  _:s1 rdf:subject <tag:danbri.org:2006:people:charlie> .
> 
> ...since the two URIs are names for the same thing, there is nothing
> true of 
> the thing called <tag:danbri.org:2006:people:charlie>  that is not also
> true of the
> thing called  <tag:danbri.org:2006:people:bob>. Similarly, we should get
> another 
> extra triple, 
>  _:s2 rdf:subject <tag:danbri.org:2006:people:bob> .
> 
> 
> At this point, if who-said-what.rdf and nndb-example-bio.rdf are 
> considered true descriptions, and we honour OWL's built-in semantics 
> for owl:sameAs, we end up with an expanded bunch of triples
> that use RDF reification vocabulary:
> 
> (please correct me if this is wrong - though i can't see how it could
> be!)
> 
>  _:s1 rdf:type rdf:Statement .
>  _:s1 rdf:predicate <http://example.org/family#wife> . 
>  _:s1 rdf:subject <tag:danbri.org:2006:people:bob> .
>  _:s1 rdf:subject <tag:danbri.org:2006:people:charlie> .
>  _:s1 rdf:object <tag:danbri.org:2006:people:alice> .
>  _:s1 <http://purl.org/dc/elements/1.1/source> <registrar-1.rdf> .
> 
>  _:s2 rdf:type rdf:Statement .
>  _:s2 rdf:predicate <http://example.org/family#wife> . 
>  _:s2 rdf:subject <tag:danbri.org:2006:people:charlie> .
>  _:s2 rdf:subject <tag:danbri.org:2006:people:bob> .
>  _:s2 rdf:object <tag:danbri.org:2006:people:mary> .
>  _:s2 <http://purl.org/dc/elements/1.1/source> <registrar-2.rdf> .
> 
> 
> So, loading up who-said-what.rdf (ostensibly, a useful file giving a
> skeptical account of
> which RDF documents made which claims), alongside nndb-example-bio.rdf
> (another useful file,
> documenting some cases in which there are multiple URI names for the
> same thing), we 
> get a description that can be queried with SPARQL.
> 
>  _:s1 rdf:type rdf:Statement .
>  _:s1 rdf:predicate <http://example.org/family#wife> . 
>  _:s1 rdf:subject <tag:danbri.org:2006:people:bob> .
>  _:s1 rdf:subject <tag:danbri.org:2006:people:charlie> .
>  _:s1 rdf:object <tag:danbri.org:2006:people:alice> .
>  _:s1 <http://purl.org/dc/elements/1.1/source> <registrar-1.rdf> .
> 
> Let's ask it if the resource <registrar-1.rdf> is the dc:source of an
> rdf:Statement
> that has a predicate 'wife', subject
> <tag:danbri.org:2006:people:charlie> and
> object <tag:danbri.org:2006:people:alice>:
> 
> (see
> http://www.w3.org/TR/2006/WD-rdf-sparql-query-20060220/#queryReification
> btw)
> 
> 
> query1.rq:
> 
> 	PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> 	PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> 
> 	ASK
> 	{ ?s rdf:subject    <tag:danbri.org:2006:people:charlie>  .
> 	  ?s rdf:predicate  <http://example.org/family#wife>  .
> 	  ?s rdf:object     <tag:danbri.org:2006:people:alice> .
> 	  ?s dc:source     <registrar-1.rdf> .
>         }
> 
> 
> My understanding is that we'd get a 'yes' back from this query, but that
> lots of folk would expect to get a 'no', since they read this 
> as "Does <registrar-1.rdf> contain the charlie/wife/alice triple?".
> 
> The RDF Semantics spec does contain some warning of this,
> http://www.w3.org/TR/rdf-mt/#Reif 
> [[
> Note that this way of understanding the reification vocabulary does not 
> interpret reification as a form of quotation. Rather, the reification 
> describes the relationship between a token of a triple and the resources 
> that triple refers to. The reification can be read intuitively as saying 
> "'this piece of RDF talks about these things" rather than "this piece 
> of RDF has this form".
> ]]
> 
> ...ie., in our scenario, it is true that registrar-1.rdf *does* talk
> about
> the thing that has a URI name <tag:danbri.org:2006:people:charlie>, even
> though that URI doesn't itself appear anywhere in the registrar-1.rdf
> graph.
> 
> Combined with the RDFCore decision on statings vs statements which
> allows
> distinct different statements to share the same predicate, subject and 
> object, RDF developers may be tempted to use RDF's reification
> vocabulary
> to keep track of "who said what". However, such descriptions interact in
> unfortunate ways with core RDF and OWL facilities, and can give 
> counter-intuitive results. 
> 
> (Note that doing all this in pure RDF, we have no problem; owl:sameAs is
> just another triple, to an RDF triplestore. It's only when the 
> OWL meaning of owl:sameAs kicks in, do we get to these issues. But RDF
> and
> OWL systems live in the same Web; documents published from an RDF-only
> shop
> may be consumed, interpreted, queried etc. by OWL systems and the
> results
> re-published on the Web as plain RDF...)
> 
> My preference is simply to never use the W3C RDF reification vocab, and
> to 
> use other mechanisms for keeping track of 'who said what'....
> 
> Aside: note also that in the openworld, nobody has assured us that 
> <tag:danbri.org:2006:people:alice> and <tag:danbri.org:2006:people:mary>
> are
> different individuals. Also that this would be a lot more complicated to 
> think about if we were using bnodes and reference-by description instead
> of 
> simple URI identifiers for people.

Received on Thursday, 23 March 2006 21:13:59 UTC