Re: different Semantics proposals (Re: Agenda for 19 Sep 2012) from Sandro Hawke on 2012-09-18 (public-rdf-wg@w3.org from September 2012)

From: Sandro Hawke <sandro@w3.org>
Date: Tue, 18 Sep 2012 10:40:08 -0400
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
CC: David Wood <david@3roundstones.com>, "public-rdf-wg@w3.org Group WG" <public-rdf-wg@w3.org>
Message-ID: <505887C8.8000004@w3.org>

On 09/18/2012 09:27 AM, Peter F. Patel-Schneider wrote:
>
> On 09/18/2012 09:12 AM, Sandro Hawke wrote:
>> On 09/18/2012 09:05 AM, Peter F. Patel-Schneider wrote:
>>>
>>> On 09/17/2012 04:46 PM, Sandro Hawke wrote:
>>>> On 09/17/2012 02:02 PM, Peter F. Patel-Schneider wrote:
>>>>
>>> [...]
>>>>
>>>> Can you be a little more specific, and tell a story about something 
>>>> specific someone is likely to want to do that they could do with 
>>>> your proposed semantics and not with the proposal on the agenda?
>>>>
>>>> (The two things I see are: (1) the default graph being "asserted", 
>>>> which seems easy enough to work around if desired [just use a named 
>>>> graph], and (2) URIs being interpreted the same way throughout the 
>>>> dataset... but I can't see what harm that could cause.   Maybe I'm 
>>>> on the wrong track.  Okay, I'm also concerned about 
>>>> unwanted-but-valid inference being done, but that's an issue 
>>>> throughout RDF, not just about datasets.)
>>>>
>>>>       -- Sandro
>>>>
>>>
>>> (2) I don't know where in the minimal semantics there is a notion 
>>> that IRIs have to be interpreted the same way throughout the 
>>> dataset, so I don't see any difference here. If, however, there is a 
>>> need to interpret IRIs the same way throughout a dataset then this 
>>> would indeed be a vast difference, essentially requiring rigid 
>>> designators in datasets.   This would mean that any equality 
>>> assertion in the default graph would carry over into the named 
>>> graphs (and maybe vice versa).
>>>
>>
>> Sorry, I just meant the IRIs of the named graphs, the n's in the 
>> <n,g> pairs, being interpreted the same as IRIs in the default graph.
>
> OK, so you are referring to the part of the semantics where it is the 
> denotation of the graph names in the default graph that is used as the 
> start of the mapping to the named graph itself.  I am against this 
> because there can be strange bleeding from the default graph to the 
> identity of the named graphs, such as in example 2.16 in 
> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Minimal-dataset-semantics
> although the analysis there is incorrect.
>
> If all you are doing is recording named graphs, then why should 
> information in the default graph potentially cause two named graphs to 
> be smushed together?
>

How could it not?

If I know
   <g1> { ... }
   <g2> { ... }

and if I know, because of some metadata, that g1 and g2 are actually the 
same thing, doesn't that imply some kind of smushing or inconsistency?
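
A minimal sketch of the kind of smushing in question, in Python. It assumes 
that the metadata is an owl:sameAs triple in the default graph relating the 
two graph names, and that "smushing" means taking the union of the triples 
filed under names that denote the same thing. Neither assumption is fixed by 
any proposal in this thread, and smush() is a hypothetical helper, not part 
of any RDF library.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"  # assumed equality predicate, for illustration only


def smush(default_graph, named_graphs):
    """Union the triples of named graphs whose names the default graph equates."""
    # Start with every graph name in its own equivalence class.
    parent = {name: name for name in named_graphs}

    def find(name):
        # Follow parent links to the representative of the class.
        while parent[name] != name:
            name = parent[name]
        return name

    # Each sameAs triple between two graph names joins their classes.
    for s, p, o in default_graph:
        if p == SAME_AS and s in parent and o in parent:
            parent[find(s)] = find(o)

    # Collect the triples of every class under its representative.
    merged = defaultdict(set)
    for name, triples in named_graphs.items():
        merged[find(name)] |= triples

    # Every original name now maps to the merged triples of its class.
    return {name: merged[find(name)] for name in named_graphs}


default_graph = {("g1", SAME_AS, "g2")}
named_graphs = {
    "g1": {("a", "b", "1")},
    "g2": {("c", "d", "2")},
}
smushed = smush(default_graph, named_graphs)
```

With an empty default graph the same call leaves g1 and g2 untouched, which 
is the behaviour the other reading of the semantics would give in all cases.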

>>> (1) Even if you used an empty default graph, you get some carry-over 
>>> into the named graphs.   For example, the named graph resources can 
>>> only be taken from the resources in this interpretation. Fortunately 
>>> (or unfortunately) all RDF interpretations are infinite, so there 
>>> probably are no observable consequences.
>>>
>>> But in any case, why should I be forced into turning my default 
>>> graph into a named graph (with some arbitrary name) and adding an 
>>> empty default graph?
>>>
>>>
>>> One interesting use of RDF datasets is to collect information from 
>>> the web.   The named graphs record the source of the graphs and 
>>> their contents.  The default graph can either be related to these 
>>> collected graphs or unrelated to them. Having the default graph 
>>> affect the meaning of the named graphs is undesired.
>>>
>>
>> I don't see how you can usefully communicate collected information 
>> like that unless you have a private protocol arranged (in which case 
>> this is all moot), or you use the default graph for metadata.
>>
>>        -- Sandro
>>
>
> I would turn this question around and ask how the current minimal 
> semantics can be used for the same purpose.
>

I'm not sure I can answer that.   The one thing I can argue for, I 
think, is that the graph names be strongly connected to those same names 
being used in the default graph.   For example, I think we need to be 
able to say things like this:

    <http://example.org/d1> { <a> <b> 1 }.
    <http://example.org/d1> eg1:lastModified "Wed, 12 Sep 2012 11:42:31 GMT".


where eg1:lastModified is defined such that this dataset conveys the 
knowledge that (1) a dereference on URL "http://example.org/d1" was 
done; (2) it resulted in (at least?) the triple
{ <a> <b> 1 }; and (3) the HTTP Last-Modified header returned during 
that dereference was the string "Wed, 12 Sep 2012 11:42:31 GMT".
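
A rough sketch of what a consumer could do with such a dataset, assuming the 
graph name and the subject of the default-graph triple are treated as the 
same name. The Python structures and the last_modified() helper are 
hypothetical, and the string "eg1:lastModified" simply stands in for the 
illustrative property above.

```python
from email.utils import parsedate_to_datetime

# The example dataset: one default-graph triple and one named graph.
dataset = {
    "default": {
        ("http://example.org/d1", "eg1:lastModified",
         "Wed, 12 Sep 2012 11:42:31 GMT"),
    },
    "named": {
        "http://example.org/d1": {("a", "b", 1)},
    },
}


def last_modified(dataset, graph_name):
    """Return the recorded Last-Modified time of a named graph, if any."""
    # Only answer for names that actually label a graph in this dataset.
    if graph_name not in dataset["named"]:
        return None
    # This lookup only works if the name means the same thing in both places.
    for s, p, o in dataset["default"]:
        if s == graph_name and p == "eg1:lastModified":
            return parsedate_to_datetime(o)  # parse the HTTP-style date string
    return None


stamp = last_modified(dataset, "http://example.org/d1")
```

If the graph name and the default-graph subject were allowed to denote 
different things, the match on graph_name would not be justified, and the 
consumer could not tie the timestamp to the triples.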

(I'm not suggesting this group standardize eg1:lastModified, or that 
this is a good way to convey this information; I'm just saying that I 
think someone, someday needs to be able to define vocabularies like 
this, given the work we are doing now.  That seems to me to be our key 
graph-semantics deliverable.)

> I also don't see why excluding useful private ways of doing things 
> (particularly ones that might already be in use) is moot.
>

Well, I think these standards only apply between systems, not within 
systems.  Private communications are essentially system-internals, and 
none of our business.  Right?

> I didn't exclude using the default graph to record information about 
> the named graphs and their sources.  However, I didn't want 
> information in the default graph to affect the situation in the named 
> graphs.
>

My concern/motivation is in my example above.   The semantics need to be 
strong enough that eg1:lastModified can be defined to make my example 
dataset mean to consumers what the producer wants it to mean.

I leave any argument for more powerful semantics to others, who have 
other use cases in mind.

      -- Sandro

> peter
>
>
Received on Tuesday, 18 September 2012 14:40:25 UTC