Re: semantics of dataset predicates from Pat Hayes on 2012-09-26 (public-rdf-wg@w3.org from September 2012)

From: Pat Hayes <phayes@ihmc.us>
Date: Wed, 26 Sep 2012 17:26:37 -0500
To: Sandro Hawke <sandro@w3.org>
Cc: W3C RDF WG <public-rdf-wg@w3.org>
Message-Id: <7FD1C418-FE9E-4CF2-8EC6-CC7DA3001A94@ihmc.us>
On Sep 26, 2012, at 7:37 AM, Sandro Hawke wrote:

> On 09/26/2012 12:59 AM, Pat Hayes wrote:
>> On Sep 25, 2012, at 9:25 PM, Sandro Hawke wrote:
>> 
>> 
>>> On 09/25/2012 09:54 PM, Pat Hayes wrote:
>>> 
>>>> On Sep 25, 2012, at 5:16 PM, Sandro Hawke wrote:
>>>> 
>>>> 
>>>> 
>>>>> For myself, at this point I'm 70% convinced that I can implement all the dataset use cases I understand (the ones I enumerated in the Federated Phonebook examples, plus SPARQL dump/restore) without any standard dataset semantics beyond having a standard place for metadata (eg the default graph in trig and the service description graph in SPARQL).
>>>>> 
>>>>> 
>>>> Sandro, how can you use metadata *at all* without some way to force a URI to denote a graph? When you use the URI in the metadata RDF, what (semantic or even pragmatic) constraint ensures that what it denotes there is the graph that you have in mind? Or indeed, that it is a graph at all?
>>>> 
>>>> 
>>> My current theory is that I can do it via the documentation of the predicate(s) I use with that URI.  
>>> 
>> Hmm. But the semantics being discussed right now explicitly distinguishes the graph "named" by the URI it is attached to in a dataset, from the entity the URI is understood to denote. And the RDF semantics says that when a URI is used in an RDF triple, it refers to what it denotes. So your documentation seems, on the face of things, to be at odds with what the semantics says.
>> 
> 
> On the face of it, perhaps, because of what you're reading into the informal language I used.  Let me see if I can be somewhat more formal without stepping in over my head:
> 
> X eg:sendCorrectionsTo Y 
> 
> is defined by:

Whoa. RDF already says how this needs to be interpreted. For some interpretation I (which you are welcome to define), this is true when the pair <I(X), I(Y)> is in the relational extension of I(eg:sendCorrectionsTo). So, you need to tell us what these three things, I(X), I(Y) and the relational interpretation for sendCorrectionsTo, are in your intended interpretations.  You don't get to just make up some arbitrary and complicated truth condition which works in a special way for your triples (if you want to call it RDF, anyway :-)
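
As a rough illustration of that truth condition, here is a minimal Python sketch. The class name `Interpretation`, its fields `denotes` and `ext`, and the example resources are all assumptions invented for this illustration; they are not taken from any RDF specification or library.

```python
from dataclasses import dataclass, field


@dataclass
class Interpretation:
    """A toy interpretation: `denotes` plays the role of I, `ext` of IEXT."""
    denotes: dict                              # name (URI string) -> resource
    ext: dict = field(default_factory=dict)    # property resource -> set of pairs

    def is_true(self, triple):
        """A triple s p o is true iff <I(s), I(o)> is in IEXT(I(p))."""
        s, p, o = triple
        try:
            pair = (self.denotes[s], self.denotes[o])
            prop = self.denotes[p]
        except KeyError:
            return False  # in this toy, a name with no denotation makes the triple false
        return pair in self.ext.get(prop, set())


# Hypothetical interpretation: the author has to say what X, Y and the property denote.
interp = Interpretation(
    denotes={
        "ex:g1": "the-graph-G1",
        "mailto:sandro@w3.org": "sandros-mailbox",
        "eg:sendCorrectionsTo": "corrections-relation",
    },
    ext={"corrections-relation": {("the-graph-G1", "sandros-mailbox")}},
)

triple = ("ex:g1", "eg:sendCorrectionsTo", "mailto:sandro@w3.org")
result = interp.is_true(triple)
```

The only inputs to `is_true` are the three denotations; nothing about the spelling of the names or about where the triple occurs can affect the answer.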

> 
> In an RDF Dataset <DG, <U0,G0>, <U1, G1>...> if an RDF triple in DG matches the graph pattern {?X eg:sendCorrectionsTo ?Y} and ?X = Ui for some i, then ?Y is a socially appropriate email address to which one may send corrections to any information expressed in Gi.

Nah. The key point is, that X in your triple refers to something I(X), and the current candidate for a dataset semantics says *explicitly* that this I(X) need *not* be the graph in that dataset. And there are dataset users out there who definitely want to have datasets where the Uis in the pairs denote something that is not a graph, such as a provenance source or a person or a time-interval. Being a "graph name" does not imply that it is the RDF name of a graph. 
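
To make the gap concrete, here is a small Python sketch in which the lookup "which graph is paired with this name" and the lookup "what does this name denote" are two separate tables. The dictionary layout, the helper names `graph_paired_with` and `denotation`, and the URI `ex:g1` are assumptions chosen for illustration only.

```python
# A dataset: a default graph plus (name, graph) pairs. Graphs are frozensets of triples.
g1 = frozenset({("w3c:group35462", "rdfs:label", "SPARQL Working Group")})
dataset = {
    "default": frozenset({("ex:g1", "eg:sendCorrectionsTo", "mailto:sandro@w3.org")}),
    "named": {"ex:g1": g1},
}

# A hypothetical interpretation in which the graph name denotes a person, not a graph.
denotes = {"ex:g1": "some-person"}


def graph_paired_with(name, ds):
    """Syntactic lookup: the graph paired with this name in the dataset, if any."""
    return ds["named"].get(name)


def denotation(name, interp):
    """Semantic lookup: what the name refers to when it is used in a triple."""
    return interp.get(name)


paired = graph_paired_with("ex:g1", dataset)
referent = denotation("ex:g1", denotes)
name_denotes_its_graph = (referent == paired)
```

Under this interpretation `name_denotes_its_graph` comes out false, and nothing in the dataset structure rules such an interpretation out.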

> My hope is that RDF Concepts 1.1 will define terminology such that I can instead say that in a much more readable form, such as:
> 
> { X eg:sendCorrectionsTo Y } means that Y is a good email address for sending corrections to the information in the named graph X.

Which is exactly what you can't do, as things stand now. 

> 
> I know this definition sort of flouts the RDF Semantics, because it defines eg:sendCorrectionsTo in terms of the URI X, itself, instead of I(X).

No "sort of" about it, this is sharply inconsistent with the 2004 normative RDF specs. We can alter the specs to make this make sense, or we can not do this. But it seems to me that we have to do one or the other. We can't coherently say that RDF semantics works one way and then normatively define part of it to work some other way. 

>  But it seems to be perfectly workable, and I think also has a coherent reading in line with the RDF Semantics and some Semantics of Datasets -- even if it's not one this group can see/agree on.

I don't think it has a coherent reading with the 2004 semantics. We can tweak those semantics so that it is coherent, but only by introducing some way to allow URIs to change their referents with different occurrences of use, which AFAIK requires some kind of context idea in the basic logic. I'm not saying this is hard to do (it's not), only that it needs to actually be done, if we are going to do it. We can't pretend to do it without actually doing it. 

> 
> In particular, I'm perfectly comfortable with inference being done with eg:sendCorrectionsTo.   In my head, when I construct various OWL constructs using X, Y, or eg:sendCorrectionsTo, and do OWL inference, I haven't noticed any counter-intuitive results.

Because, I suggest, you have a coherent picture in your head in which "graph names" really do name graphs. I wish we could all agree on this, but apparently we can't. Try your intuition on your examples while chanting to yourself, X does *not* mean a graph, X does *not* mean a graph, ...  Or try putting your RDF together with some other RDF which uses the same X to refer to a person, and see what you can derive in OWL. For a start, what is the rdfs:domain of eg:sendCorrectionsTo? 

> 
>> The basic point is, you can specify what your property means, but you don't get free rein to say what the graph "name" URI refers to. (At least, not unless you have something like a context to put it in, where you *could* say what it means.) 
>> 
> 
> Yeah, formally, for now, I'm ignoring what the URI X might refer to, and just using X itself.

But this contradicts what the 2004 semantics says. You can define a different interpretation in which X refers to what you want it to refer to, but then in that interpretation it doesn't refer to what the dataset writers had in mind. And (given a few plausible axioms), no interpretation is going to do what you want and also what they want at the same time, so your view is inconsistent with their view, and this inconsistency is going to be derivable as an OWL contradiction from those plausible axioms. And, just to rub in the salt, this is not a bug or a problem: this is what RDF and OWL are built for, to be able to detect contradictions like this, because they reveal cognitive errors in writing coherent ontological content, and signal valid entailments. Saying that something is both a graph and a person is the bug; not being able to say it consistently is a *feature* of RDF/OWL. 
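
A toy Python sketch of how such a clash would surface. The axioms used here (a domain of `eg:Graph` for eg:sendCorrectionsTo, and `eg:Graph` being disjoint from foaf:Person) are assumed stand-ins for the "plausible axioms"; the class `eg:Graph`, the URI `ex:x`, and both helper functions are hypothetical, and this is nowhere near a real OWL reasoner.

```python
def infer_types(triples, domains):
    """Apply rdfs:domain: if p has domain C and (s, p, o) holds, then s is a C."""
    types = {(s, o) for s, p, o in triples if p == "rdf:type"}
    for s, p, o in triples:
        if p in domains:
            types.add((s, domains[p]))
    return types


def find_clashes(types, disjoint_pairs):
    """Return the names typed with two classes that are declared disjoint."""
    clashes = set()
    for a, b in disjoint_pairs:
        members_a = {s for s, c in types if c == a}
        members_b = {s for s, c in types if c == b}
        clashes |= members_a & members_b
    return clashes


# Assumed axioms, for illustration only.
domains = {"eg:sendCorrectionsTo": "eg:Graph"}
disjoint_pairs = [("eg:Graph", "foaf:Person")]

# One source uses ex:x as a graph; another uses the same URI for a person.
metadata = {("ex:x", "eg:sendCorrectionsTo", "mailto:sandro@w3.org")}
other_source = {("ex:x", "rdf:type", "foaf:Person")}

clashes = find_clashes(infer_types(metadata | other_source, domains), disjoint_pairs)
```

Each source is consistent on its own; the clash on `ex:x` only appears once the two are merged, which is the detection behaviour described above.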

> I know that's not Kosher, but I'm at a loss to find an actual problem with it.   Model theory is one way to describe coherent system behavior

Model theory describes truth conditions. Truth conditions define entailment and consistency relationships between graphs (and sentences more generally), which can then be used to help specify system behavior. But there is no direct single step from model theory to behavior.

Your logic here has the form: I want to be able to say nonsense. I can't see any problem with nonsense. This system doesn't let me say nonsense, so I need to change the system. Bad idea, I suggest. 

> , but if I can express that coherent system behavior in other ways, that's okay too.

Well, great. Please, do that. Give us a non-model-theory semantics for RDF. Seriously, I'd love to see how you would begin to tackle this. If it makes sense, I will work with you on trying to make it into a full RDF semantics. 

>   Hopefully we'll converge back on model theory, when you see how to use it to explain what I'm trying to do.
> 
> Let's see how far I can go with this.  

Well, can you first tell us what "this" is, more or less? What are the ground rules for whatever it is you are designing here?

> We'll start with another predicate:
> 
> { ?X eg:userProfile ?Y } means that ?X is a foaf:Person and the triples of their user profile are in named graph ?Y
> which seems fine to me, then let's twist that into the admittedly-crazy usage we sometimes use as an example, where a foaf:Person URI is used as the graph-label in a dataset for their profile info.   So, we get something like this:
> { ?X rdf:type eg:PersonAndProfileLabel } means that ?X denotes a foaf:Person and the triples in that person's user profile are in the named graph ?X.

No, it has to mean that the person is in some class or other (because that's how rdf:type is defined). I guess it could be the class of people who are the name of the named graph containing their own user profile, but this seems like a rather odd class. And in any case, that doesn't refer to the particular name ?X, so maybe this guy has another name ?Z owl:sameAs ?X and his user profile is paired with ?Z rather than with ?X. 

There is no way to "hold on" to the particular name in the semantic recursions:  once you interpret a name, you are working with whatever the interpretation says the name denotes, and the name you used to get there is gone. 

Maybe if you had some way to actually refer to that graph other than by using the name ?X, we could make better sense of this.  Since graph labels can mean anything, the pairing <N, G> in a dataset actually asserts a relation between N and G; let's call this relation :graphLabelOf. Then your condition is that the person's user profile is in a graph _:x, where ?X :graphLabelOf _:x, right? OK, that makes sense. The right way to render this in RDF is

?X :graphLabelOf _:x
_:x eg:userProfile ?Y

but this still does not rely on the particular name you use to refer to ?X with. For example, if ?X owl:sameAs ?Z, then this is equivalent to 

?Z :graphLabelOf _:x
_:x eg:userProfile ?Y

but that first triple could well be false, right? 
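
A minimal Python sketch of that substitution. The function `same_as_closure` and the concrete names `ex:X`, `ex:Y`, `ex:Z` are assumptions for illustration; it only substitutes directly stated equalities and does not attempt full owl:sameAs reasoning.

```python
def same_as_closure(triples):
    """Substitute directly stated owl:sameAs equals in subject and object position."""
    triples = set(triples)
    equal = {(s, o) for s, p, o in triples if p == "owl:sameAs"}
    equal |= {(b, a) for a, b in equal}  # owl:sameAs is symmetric
    changed = True
    while changed:
        changed = False
        for s, p, o in list(triples):
            for a, b in equal:
                candidates = []
                if s == a:
                    candidates.append((b, p, o))
                if o == a:
                    candidates.append((s, p, b))
                for new in candidates:
                    if new not in triples:
                        triples.add(new)
                        changed = True
    return triples


graph = {
    ("ex:X", ":graphLabelOf", "_:x"),
    ("_:x", "eg:userProfile", "ex:Y"),
    ("ex:X", "owl:sameAs", "ex:Z"),
}
entailed = same_as_closure(graph)

# The dataset itself only pairs ex:X with a graph; ex:Z labels nothing in it.
dataset_labels = {"ex:X"}
z_entailed_as_label = ("ex:Z", ":graphLabelOf", "_:x") in entailed
z_is_actual_label = "ex:Z" in dataset_labels
```

The substitution yields the ex:Z triple even though the dataset pairs only ex:X with a graph, so the entailed triple and the actual pairing come apart.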

> Speaking as a software engineer, I think that probably works.  I'm pretty sure coherent code can be written to work with that class, even in the presence of various kinds of reasoning.

Maybe. For myself, I would have no idea how to tell if the code was working or not. I can't understand its specification.

Pat


>   I'd need a day or two of coding to get that to "quite sure", and I'd probably do a few tweaks along the way. 
> 
>        -- Sandro
> 
>> Pat
>> 
>> 
>> 
>>>  Given the emerging sense of what datasets are, I think I can write that document in such a way that it will be quite clear to human readers (and thus the software they write) how it connects to the triples in the named graphs.
>>> 
>>> I'm hoping the WG will end up making it really easy to write that documentation, but I'm pretty sure it's possible anyway.
>>> 
>>> The example I used in [1] was:
>>> 
>>>     <g1> eg:sendCorrectionsTo <mailto:sandro@w3.org>.
>>>     <g1> { w3c:group35462 rdfs:label "SPARQL Working Group" }.
>>>     <g2> eg:sendCorrectionsTo <mailto:ivan@w3.org>.
>>>     <g2> { w3c:group44350 rdfs:label "RDFa Working Group" }.
>>> 
>>> (I'm using the default graph for metadata, and leaving off the braces around default-graph triples)
>>> 
>>> And then I proposed documenting eg:sendCorrectionsTo something like this:
>>> 
>>>     X eg:sendCorrectionsTo Y
>>> 
>>>         Note: only meaningful as metadata for a dataset which has a named graph
>>>         with the name X.
>>> 
>>>         Meaning: Y is a good email address for sending corrections to
>>>         the information in the named graph X.
>>> 
>>> 
>>> This definition doesn't really care whether named graphs are g-boxes or g-snaps, but I think it's probably good enough to work in practice.   Other predicates might be much more precise, of course.   
>>> 
>>> I'm not thrilled with the predicate being only meaningful when used in a dataset metadata [I preferred my framing as rdf-spaces] but this works, too, I think.
>>> 
>>>         -- Sandro
>>> 
>>> [1] http://lists.w3.org/Archives/Public/public-rdf-wg/2012Sep/0181.html
>>> 
>>> 
>>> 
>>>> Pat
>>>> 
>>>> 
>>>> 
>>>>>    -- Sandro
>>>>> 
>>>>> [1] http://www.w3.org/2005/10/Process-20051014/tr#cfr
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> ------------------------------------------------------------
>>>> IHMC                                     (850)434 8903 or (650)494 3973   
>>>> 40 South Alcaniz St.           (850)202 4416   office
>>>> Pensacola                            (850)202 4440   fax
>>>> FL 32502                              (850)291 0667   mobile
>>>> phayesAT-SIGNihmc.us       
>>>> 
>>>> http://www.ihmc.us/users/phayes
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
> 

Received on Wednesday, 26 September 2012 22:27:10 UTC