Re: Bound and Unbound Datasets from Pat Hayes on 2013-06-04 (public-rdf-wg@w3.org from June 2013)

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 4 Jun 2013 12:35:12 -0500
To: Sandro Hawke <sandro@w3.org>
Cc: W3C RDF WG <public-rdf-wg@w3.org>
Message-Id: <598FF1A7-C0AF-446B-8330-101731BD0622@ihmc.us>
On Jun 3, 2013, at 5:02 PM, Sandro Hawke wrote:

> On 06/03/2013 10:46 AM, Pat Hayes wrote:
>> On Jun 3, 2013, at 10:34 AM, Sandro Hawke wrote:
>> 
>>> It looks to me like we have two very different camps concerning datasets. ISSUE-131 has brought this to light again, but the camps long predate that issue. The division is between the people who have been using datasets with application-dependent semantics for a long time and the people who want to build things which require standard interoperable semantics for datasets. I'm in the latter camp, and was arguing for it for a long time, but I decided some months ago I could live without standard semantics via a very convoluted mechanism. I agreed to document that mechanism, but as I have contemplated doing so, I've been dragging my feet because it's pretty weird and I think the group won't like it. (Talking off-list to Pat about it yesterday, I think it's safe to say he hated it.)
>> You betcha.
>> 
>>> So I have an alternative proposal.  Let's have two kinds of datasets:
>>> 
>>> * "Unbound" datasets are what's been in SPARQL and rdf-concepts so far.   According to the standard they are just structure, with no semantics.  In practice, their semantics are determined by the application in which they are used.
>>> 
>>> * "Bound" datasets have the following semantics:
>>>      (1) for the dataset to be true, the default graph must be true;
>> But with a slight tweak, see below.
>> 
>>>      (2) graph names denote the graphs they are paired with.
>>> 
>>> I suggest we indicate a dataset is bound by putting the magic triple { <> a rdf:BoundDataset } in its default graph. (This triple would be treated specially in the RDF semantics for any system which implements/recognizes bound datasets; to other systems (e.g. SPARQL) it's just another triple.)
>> We could treat this in the following "context" way, that IRIs (and bnodes?) which occur as graph labels in the dataset are interpreted **in the default graph** as denoting the graphs they label. That is, *just* in the default graph. That then allows the use of IRIs which 'globally' denote something else to still be used as graph labels, without breaking the semantics. That would allow a (very limited and special-purpose) kind of punning to be used in default graphs for the purposes of graph identification.
> 
> Why do you think people want this feature?   I'm not seeing how it's very useful.

Well, the (only?) reason we don't just say, as a blanket rule, that any graph label must denote the graph it labels is that some people were very vocal about wanting to use IRIs which denote something else as graph labels. So I thought this was a clear use case that it would be helpful to allow, and this tweak does allow it. And it seem(ed) harmless because I thought you only wanted to put metadata into the default graph. But from your (Alice said Bob said ...) example, I see that was a mistaken impression on my part. Yes, if you want graph labels to universally denote the graphs they label, then forget my suggestion.

It might have been too cute in any case; punning is always dangerous.

So I withdraw the suggestion. Let's keep it simple. The semantics of { name {graph} } is I(name) = graph, end of story.
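
A minimal sketch of that rule, under loud assumptions: a graph is modelled as a frozenset of string triples, an interpretation I is reduced to a plain dict from names to whatever they denote, and names_denote_graphs is a hypothetical helper, not part of any RDF library. It checks only the naming condition, not the truth of the default graph.

```python
# Hypothetical model: a graph is a frozenset of (s, p, o) string triples,
# and an interpretation I is reduced to a dict from names to denotations.
Graph = frozenset

def names_denote_graphs(named_graphs, I):
    """True when I(name) == graph for every { name {graph} } pair."""
    return all(I.get(name) == graph for name, graph in named_graphs.items())

g1 = Graph({(":bob", ":says", ":g2")})
g2 = Graph({(":a", ":b", ":c")})
named = {":g1": g1, ":g2": g2}

# An interpretation that maps each label to its graph satisfies the rule.
ok = names_denote_graphs(named, {":g1": g1, ":g2": g2})

# One that lets :g2 denote something other than its graph does not.
bad = names_denote_graphs(named, {":g1": g1, ":g2": "a person"})
```

In this toy model the rule leaves no room for a label to denote anything but the graph it is paired with, which is exactly what rules out the punning.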

>> I think this would be the most useful way to handle this.
>> 
>> Technically, the semantics of a bound dataset is: define the binding map B to be the function from IRIs (and bnodes?) used as graph labels to the graphs they label. Then the dataset is true in I just when the default graph D is true in I/B, ie the interpretation which is just like I except it maps IRIs (and bnodes?) in the domain of B to their B value.
> 
> So how would I say "Alice says Bob says a-b-c"?   I'd expect to write:
> 
> { :alice :says :g1 }
> :g1 { :bob :says :g2 }
> :g2 { :a :b :c }
> 
> but I think with your proposal my two occurrences of :g2 would not be connected.

Indeed they would not. (But do you seriously see this as a likely use case?)
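
To see why, here is a rough sketch of the withdrawn default-graph-only reading, again assuming a toy model in which I and the binding map B are plain dicts and denotation is a hypothetical helper. The "something else" value stands in for whatever :g2 might denote globally.

```python
def denotation(term, I, B, in_default_graph):
    """Withdrawn proposal: B overrides I only inside the default graph (I/B)."""
    if in_default_graph and term in B:
        return B[term]
    return I.get(term)

g2 = frozenset({(":a", ":b", ":c")})
g1 = frozenset({(":bob", ":says", ":g2")})

B = {":g1": g1, ":g2": g2}      # binding map: graph label -> labelled graph
I = {":g2": "something else"}   # hypothetical 'global' denotation of :g2

# :g1 occurs in the default graph, so it is read through B.
in_default = denotation(":g1", I, B, in_default_graph=True)

# :g2 occurs inside the graph named :g1, so it is read through plain I.
in_g1 = denotation(":g2", I, B, in_default_graph=False)
```

The occurrence of :g2 inside :g1 is interpreted by I alone, so nothing ties it to the graph that :g2 labels in the dataset.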

> 
> Or maybe they would be, due to whatever semantics get provided for :says ?

Not really. :says has to take what it is given, and unless some dataset rule says that the graph name is referentially linked to its graph, that link isn't going to be in place. The best you could do is to say that :says requires **a** graph in its range, but that won't hack it. 

Pat

> 
>>>  If a dataset does not have this flag, it's unbound.   Of course, being unbound, it has application-specific semantics and so an application may choose to treat it as bound.
>>> 
>>> I think this would solve a lot of problems, and not raise too many.
>> I agree. It does mean, cf. Gregg's comment, that merging datasets requires paying attention to this flag and treating it seriously, as its presence/absence can change IRI denotations. But I don't see any way to have complete freedom to merge datasets in any case if they might be using application-dependent semantics.
> 
> Yes, it seems to me that it's fairly easy to merge bound datasets but that it's simply impossible to merge unbound datasets while preserving their application-dependent semantics unless you know/implement those semantics.
> 
>>>     I expect many of the folks who wanted us to standardize named graphs, fix reification, etc, when this group was chartered, would much prefer having this option to having only the half-solution that's in our specs now.
>> I wholeheartedly agree.
> 
> :-)
> 
>       - s
>> Pat
>> 
>>>      -- Sandro

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 4 June 2013 17:35:42 UTC