Re: Bound and Unbound Datasets from Sandro Hawke on 2013-06-03 (public-rdf-wg@w3.org from June 2013)

From: Sandro Hawke <sandro@w3.org>
Date: Mon, 03 Jun 2013 15:02:50 -0700
To: Pat Hayes <phayes@ihmc.us>
CC: W3C RDF WG <public-rdf-wg@w3.org>
Message-ID: <51AD128A.3060603@w3.org>
On 06/03/2013 10:46 AM, Pat Hayes wrote:
> On Jun 3, 2013, at 10:34 AM, Sandro Hawke wrote:
>
>> It looks to me like we have two very different camps concerning datasets. ISSUE-131 has brought this to light again, but the camps long predate that issue. The division is between the people who have been using datasets with application-dependent semantics for a long time and the people who want to build things which require standard interoperable semantics for datasets. I'm in the latter camp, and was arguing for it for a long time, but I decided some months ago I could live without standard semantics via a very convoluted mechanism. I agreed to document that mechanism, but as I have contemplated doing so, I've been dragging my feet because it's pretty weird and I think the group won't like it. (Talking off-list to Pat about it yesterday, I think it's safe to say he hated it.)
> You betcha.
>
>> So I have an alternative proposal.  Let's have two kinds of datasets:
>>
>> * "Unbound" datasets are what's been in SPARQL and rdf-concepts so far.   According to the standard they are just structure, with no semantics.  In practice, their semantics are determined by the application in which they are used.
>>
>> * "Bound" datasets have the following semantics:
>>       (1) for the dataset to be true, the default graph must be true;
> But with a slight tweak, see below.
>
>>       (2) graph names denote the graphs they are paired with.
>>
>> I suggest we indicate a dataset is bound by putting the magic triple { <> a rdf:BoundDataset } in its default graph. (This triple would be treated specially in the RDF semantics for any system which implements/recognizes bound datasets; to other systems, e.g. SPARQL, it's just another triple.)
> We could treat this in the following "context" way, that IRIs (and bnodes?) which occur as graph labels in the dataset are interpreted **in the default graph** as denoting the graphs they label. That is, *just* in the default graph. That then allows the use of IRIs which 'globally' denote something else to still be used as graph labels, without breaking the semantics. That would allow a (very limited and special-purpose) kind of punning to be used in default graphs for the purposes of graph identification.

Why do you think people want this feature?   I'm not seeing how it's 
very useful.

> I think this would be the most useful way to handle this.
>
> Technically, the semantics of a bound dataset is: define the binding map B to be the function from IRIs (and bnodes?) used as graph labels to the graphs they label. Then the dataset is true in I just when the default graph D is true in I/B, i.e. the interpretation which is just like I except that it maps IRIs (and bnodes?) in the domain of B to their B value.

So how would I say "Alice says Bob says a-b-c"?   I'd expect to write:

{ :alice :says :g1 }
:g1 { :bob :says :g2 }
:g2 { :a :b :c }

but I think with your proposal my two occurrences of :g2 would not be 
connected.

Or maybe they would be, due to whatever semantics get provided for :says?
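
To make the worry concrete, here is a minimal sketch of one possible reading of the I/B definition, applied to the three-graph example above. Everything in it is an assumption for illustration: triples are modelled as tuples of strings, the base interpretation I maps every term to itself, the helper names binding_map and denotation are hypothetical, and named graphs are assumed to be read in plain I rather than I/B (the definition only speaks of the default graph).

```python
# Hypothetical model: a triple is a (subject, predicate, object) tuple of
# strings, a graph is a frozenset of triples, and the named graphs are a
# dict from graph label to graph.

def binding_map(named_graphs):
    """B: map each graph label to the graph it labels."""
    return dict(named_graphs)

def denotation(term, base_interp, overrides):
    """I/B: like base_interp, except terms in overrides take their B value."""
    if term in overrides:
        return overrides[term]
    # Assumption: terms unknown to the base interpretation denote themselves.
    return base_interp.get(term, term)

g2 = frozenset({(":a", ":b", ":c")})
g1 = frozenset({(":bob", ":says", ":g2")})
default = frozenset({(":alice", ":says", ":g1")})
named = {":g1": g1, ":g2": g2}

B = binding_map(named)
I = {}  # base interpretation, left empty so every term denotes itself

# In the default graph, read in I/B, :g1 denotes the graph it labels.
obj_in_default = denotation(":g1", I, B)

# Inside g1, assumed here to be read in plain I, :g2 is not tied to g2.
obj_in_g1 = denotation(":g2", I, {})
```

Under those assumptions obj_in_default is the graph g1, while obj_in_g1 is just the term :g2, so the second level of nesting is lost unless the semantics of :says reconnects it.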

>>   If a dataset does not have this flag, it's unbound.   Of course, being unbound, it has application-specific semantics and so an application may choose to treat it as bound.
>>
>> I think this would solve a lot of problems, and not raise too many.
> I agree. It does mean, cf. Gregg's comment, that merging datasets requires paying attention to this flag and treating it seriously, as its presence/absence can change IRI denotations. But I don't see any way to have complete freedom to merge datasets in any case if they might be using application-dependent semantics.

Yes, it seems to me that it's fairly easy to merge bound datasets but 
that it's simply impossible to merge unbound datasets while preserving 
their application-dependent semantics unless you know/implement those 
semantics.
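
A rough sketch of what such a merge might look like, under several assumptions that are not settled in this thread: a dataset is modelled as a (default graph, named graphs) pair, the flag triple is represented by a placeholder tuple, blank nodes and the resolution of <> against each document's base are ignored, and a label paired with two different graphs is treated as a conflict. The names FLAG, is_bound and merge_bound are hypothetical.

```python
# Placeholder for the magic triple { <> a rdf:BoundDataset }.
FLAG = ("", "rdf:type", "rdf:BoundDataset")

def is_bound(dataset):
    """A dataset counts as bound when its default graph carries the flag."""
    default, _named = dataset
    return FLAG in default

def merge_bound(ds1, ds2):
    """Union two bound datasets; refuse if either one is unbound."""
    if not (is_bound(ds1) and is_bound(ds2)):
        raise ValueError("refusing to merge: at least one dataset is unbound")
    default1, named1 = ds1
    default2, named2 = ds2
    merged_named = dict(named1)
    for label, graph in named2.items():
        # Assumption: a label can denote only one graph, so a mismatch is
        # reported instead of silently unioning the two graphs.
        if label in merged_named and merged_named[label] != graph:
            raise ValueError(f"label {label} is paired with two different graphs")
        merged_named[label] = graph
    return (default1 | default2, merged_named)

a = (frozenset({FLAG, (":alice", ":says", ":g1")}),
     {":g1": frozenset({(":a", ":b", ":c")})})
b = (frozenset({FLAG}),
     {":g2": frozenset({(":d", ":e", ":f")})})
merged_default, merged_named = merge_bound(a, b)
```

An unbound dataset is rejected outright here because, without knowing its application-dependent semantics, there is no way to tell whether the union preserves them.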

>>      I expect many of the folks who wanted us to standardize named graphs, fix reification, etc, when this group was chartered, would much prefer having this option to having only the half-solution that's in our specs now.
> I wholeheartedly agree.

:-)

        - s
> Pat
>
>>       -- Sandro
>>
> ------------------------------------------------------------
> IHMC                                     (850)434 8903 or (650)494 3973
> 40 South Alcaniz St.           (850)202 4416   office
> Pensacola                            (850)202 4440   fax
> FL 32502                              (850)291 0667   mobile
> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>
Received on Monday, 3 June 2013 22:03:03 UTC