Re: rdfs:Graph ? comment on http://www.w3.org/TR/rdf11-concepts/#section-dataset and issue 35 from Sandro Hawke on 2013-09-07 (public-rdf-comments@w3.org from September 2013)

From: Sandro Hawke <sandro@w3.org>
Date: Sat, 07 Sep 2013 11:06:03 -0400
To: Jeremy J Carroll <jjc@syapse.com>
CC: "public-rdf-comments@w3.org Comments" <public-rdf-comments@w3.org>
Message-ID: <522B40DB.8010101@w3.org>
[ I'm replying on-list, but if this is going to go back-and-forth much, 
let's move it to www-archive. ]

On 09/06/2013 02:41 PM, Jeremy J Carroll wrote:
> I remain unhappy with this resolution.
>
> I am reviewing the minutes:
> https://www.w3.org/2013/meeting/rdf-wg/2013-08-21#semantics_and_concepts
> and Peter is correct in saying:
> "Jeremy wants there to be a way to require graph names denoting graphs"

Unfortunately, that notion is surprisingly ambiguous.  More explicit 
questions below.

> I currently have a quad store with several named graphs, and some of these named graphs 'belong' to 'organizations' within my knowledge model.
> One of the graphs is named <http://my.graph.name.example.org/>, one of the organizations is syapse:stoogesInc.
>
> I wish to be able to add a triple:
>
>     <http://my.graph.name.example.org/> syapse:belongs syapse:stoogesInc .
>
> to a graph and have that triple mean, according to the RDF Semantics, RDF Concepts and RDF Vocabulary recommendations, that in an interpretation in which the triple is true that:
>     ( I(<http://my.graph.name.example.org/>), I(syapse:stoogesInc) ) is in IEXT(I(syapse:belongs))

As I understand it, that part is certainly true.   That's normal RDF 
Semantics.  The fact that http://my.graph.name.example.org/ is also used 
as a graph name doesn't relax this semantic requirement.
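
For illustration only, that truth condition can be sketched in a few 
lines of Python.  This is a toy, not anything taken from the specs: the 
mapping I, the extension map IEXT, and the "resource-..." stand-ins are 
all invented for the example.

```python
# Toy simple interpretation: every name below is a hypothetical stand-in.
# I maps IRIs to resources; IEXT maps a property resource to a set of pairs.
I = {
    "<http://my.graph.name.example.org/>": "resource-g",
    "syapse:belongs": "resource-belongs",
    "syapse:stoogesInc": "resource-stooges",
}
IEXT = {
    "resource-belongs": {("resource-g", "resource-stooges")},
}


def triple_is_true(s, p, o):
    """A ground triple s p o is true when (I(s), I(o)) is in IEXT(I(p))."""
    return (I[s], I[o]) in IEXT.get(I[p], set())


print(triple_is_true(
    "<http://my.graph.name.example.org/>", "syapse:belongs", "syapse:stoogesInc"
))
```

Nothing in the check looks at whether the subject IRI is also used as a 
graph name, which is the point made above.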

> and for it to not merely be an application convention that we are in fact referring to the named graph in the dataset as the subject of the triple, but for there to be some normative manner, whether formally or informally, that licenses application specific behavior involving the named graph on the basis of the truth of this triple involving the graph name.

I believe that's true as well, if we're using a loose, informal 
definition of "referring".   The specs do not require 
I(<http://my.graph.name.example.org/>) = the set of triples in 
question, but they do require I(<http://my.graph.name.example.org/>) = 
something clearly paired with a particular set of triples (while in a 
particular dataset).  So in that sense the graph name can be used to 
indicate or point to those triples, even if it isn't strictly "referring".

Or something like that -- I find this way of thinking about it too 
abstract to be sure I've got it right.  I prefer test cases, as 
below.

> Essentially, the promise from RDF is the ability to "say anything about anything".
> Here I seem unable to say anything at all about an RDF graph - I cannot name it in a way in which I can use the name in further RDF.
>
> Of course, there are some formal limitations such as Russell's Paradox or the Patel-Schneider paradox that constrain the bumper sticker, but the ability to make simple statements about RDF graphs does seem to be a pretty minimalist requirement.
>
> It does not seem in the spirit of RDF at all to exclude this case, which is of course common practice, because it is often necessary!
>
> While I am sympathetic to the pressures of schedule and the difficulties of finding consensus,
> I do not believe that the WG has delivered the charter requirement of:
> "Standardize a model and semantics for multiple graphs"
> when the current semantics for multiple graphs does not support the ability to make statements about the graphs in the dataset with any normative force.
>
> An example of informal text that allows me to say something about RDF related concepts is section 1.5 of RDF concepts
> http://www.w3.org/TR/rdf11-concepts/#change-over-time
>
> This makes it clear that it is reasonable to use the URL associated, by the web architecture, with an RDF source within descriptions of that source in RDF graphs.
> I wish to have a similar set of reasonable expectations that, in some settings, I can make statements about RDF graphs within a quad store, as well.

Hmmm.     Maybe all you're looking for here is a few, informal words 
that don't affect the formal semantics?

Can you propose a specific text change that would address your concern?

What I'm most trying to understand is how your change would affect 
implementations, and how they would have to change to remain 
conformant.   If they don't have to change, then this might just be an 
editorial change we can easily include.   If they do have to change, 
then I think we'll need test cases to make the change crystal clear.

To show what I mean, here are two key test cases, reflecting the 
different ways people seem to think about and use datasets.    The key 
stumbling block in the WG was discovering that existing implementations 
relied on different outcomes for these situations:

TC1 - Does this dataset:
    PREFIX : <http://example.org/#>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    :g1 :p 1.
    GRAPH :g1 { :a :b :c }
    GRAPH :g2 { :a :b :c }
entail this:
    :g2 :p 1.
?   For you, I expect it does not.
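
A rough Python sketch of the two possible answers to TC1 follows.  It 
is an illustration under stated assumptions, not a real entailment 
checker: the function name and the names_denote_triple_sets switch are 
invented here, and only substitution of a graph name in subject 
position is handled.  If a graph name denotes the set of triples paired 
with it, :g1 and :g2 denote the same set, so what is said of :g1 holds 
of :g2.  Otherwise nothing follows.

```python
# TC1 as data: default-graph triples plus named graphs (sets of triples).
default_graph = {(":g1", ":p", 1)}
named_graphs = {
    ":g1": frozenset({(":a", ":b", ":c")}),
    ":g2": frozenset({(":a", ":b", ":c")}),
}


def entails(triple, names_denote_triple_sets):
    """Very rough check of whether the dataset entails a default-graph triple.

    Hypothetical helper: it only considers swapping one graph name for
    another in subject position.
    """
    if triple in default_graph:
        return True
    if not names_denote_triple_sets:
        return False
    s, p, o = triple
    # Names paired with identical triple sets denote the same set, so a
    # statement about one of them also holds for the other.
    for (s2, p2, o2) in default_graph:
        if (p2, o2) == (p, o) and s in named_graphs and s2 in named_graphs:
            if named_graphs[s] == named_graphs[s2]:
                return True
    return False


print(entails((":g2", ":p", 1), names_denote_triple_sets=True))
print(entails((":g2", ":p", 1), names_denote_triple_sets=False))
```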

TC2 - Given this pair of datasets:
D1:
    PREFIX : <http://example.org/#>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    GRAPH :g1 { :a :b 1 }
D2:
    PREFIX : <http://example.org/#>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    GRAPH :g1 { :a :b 2 }
Is the pair, taken together, inconsistent?    Or does the pair simply 
entail
    GRAPH :g1 { :a :b 1,2 }
?

My current thinking (not yet discussed with the WG) is that the Note 
will document URIs to be used as RDF classes, indicating which of the 
above behaviors is intended.   Specifically:

* rdf:Unified - the graph name denotes the mathematical set of triples 
paired with it in the dataset.   So if TC1 includes {:g1 a rdf:Unified.  
:g2 a rdf:Unified.}, then the given entailment would hold (otherwise it 
would not).
* rdf:Complete - the graph name denotes a "container" which holds 
exactly the triples paired with it in the dataset.  So in TC2 if either 
D1 or D2 included { :g1 a rdf:Complete }, then the datasets would be 
contradicting each other.
* rdf:Partial - the graph name denotes a "container" which holds at 
least the triples paired with it in the dataset.  In TC2, if either D1 
or D2 said { :g1 a rdf:Partial }, then the datasets could be merged as 
shown.

(The current specs say merging is application specific.   These classes 
would allow standardized merging in the cases where every graph name was 
stated as being an instance of one of these three classes.    
(rdf:Unified graphs are like rdf:Complete for merging -- the merge is 
only consistent if the triples are the same.))
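
As a sketch of what such standardized merging could look like, under 
the assumption that the class of each graph name is already known, here 
is a small Python function.  The function name and the string tags 
"Unified", "Complete" and "Partial" are stand-ins invented for this 
example; they mirror the three proposed classes but are not part of any 
spec.

```python
def merge_named_graph(kind, triples_d1, triples_d2):
    """Merge the triples two datasets pair with the same graph name.

    kind is one of "Unified", "Complete", "Partial" (stand-ins for the
    proposed rdf:Unified, rdf:Complete and rdf:Partial classes).
    Returns the merged set, or raises ValueError if the datasets conflict.
    """
    a, b = frozenset(triples_d1), frozenset(triples_d2)
    if kind == "Partial":
        # Each dataset holds at least some of the triples: take the union.
        return a | b
    if kind in ("Complete", "Unified"):
        # Each dataset claims to hold exactly the triples: they must agree.
        if a != b:
            raise ValueError("datasets contradict each other")
        return a
    raise ValueError(f"unknown kind: {kind}")


# TC2: D1 and D2 pair different triples with :g1.
d1 = {(":a", ":b", 1)}
d2 = {(":a", ":b", 2)}
print(merge_named_graph("Partial", d1, d2))
```

With "Complete" or "Unified" in place of "Partial", the same call raises 
ValueError, matching the "contradicting each other" outcome for TC2.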

Sorry if that went too far afield, but hopefully this clarifies the kind 
of input I think we need from you to know how to proceed.

> I believe that this issue is an important one and should not be hived off into an informative WG Note.

Let's come back to that when we're clear on what functionality you're 
asking for.

        -- Sandro

> Jeremy
>
>
>
>
>
> On Aug 21, 2013, at 10:11 AM, Sandro Hawke <sandro@w3.org> wrote:
>
>> Jeremy,
>>
>> Thank you very much for your comments and interest.  As you may know, the Working Group has discussed the issues around Named Graphs and Datasets extensively, for over two years now.  The Last Call version of RDF 1.1 Concepts includes a minimalist design that we believe is a reasonable foundation on which interoperable systems can be built, although in many cases those systems will require additional standards in order to have interoperability. Many designs were considered for adding more expressivity/functionality, but the Working Group decided that none of them were sufficiently mature to include in the current Recommendation-Track documents, which are to be completed by the end of the year.
>>
>> We have agreed in principle to publish two Working Group Notes to provide some guidance and tools for people to move forward with this minimal design, gathering experience with systems built on it, to support future standards work.  Specifically, I am drafting one, aimed at programmers using datasets for several common use cases, building on classes like your suggested rdfs:Graph.  Meanwhile, Antoine Zimmermann is drafting another, offering a framework for providing formal semantics to datasets.  As Working Group Notes, as you know, these documents can offer experimental solutions, not refined and tested to the degree required of Recommendations.   If you are interested in reviewing early drafts of the note I'm working on, or otherwise helping (test cases?), please let me know and I'll keep you in the loop.
>>
>> We understand this isn't your preferred outcome, but under the circumstances, are you satisfied with this response?
>>
>> Also: please note that the SPARQL Working Group (the successor to DAWG) has concluded.   Its comments list (which you CC'd and I'm also CC'ing) is being used for errata handling and gathering material for a possible future group.    For discussion, the right forums are probably either public-sparql-dev@w3.org or the RDF Working Group.
>>
>> On behalf of the RDF Working Group,
>>     -- Sandro
>>
>>
>>
>>
>>
>> On 07/11/2013 03:06 PM, Jeremy J Carroll wrote:
>>> Hello
>>>
>>> This is a formal comment on http://www.w3.org/TR/rdf11-concepts/#section-dataset, and it appears a comment on
>>> https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-schema/index.html
>>> and quite possibly on the RDF Semantics ….
>>>
>>> It seems to be a suggestion to reopen issue 35
>>> http://www.w3.org/2011/rdf-wg/track/issues/35
>>> which points to
>>> http://www.w3.org/TR/sparql11-service-description/
>>> hence I am CC-ing dawg.
>>> The last part of this message discusses problems in using service description to meet my use case: to me, this is not a comment on DAWG's work, but a comment that RDF Core cannot use DAWG's work of more limited scope to duck the issue.
>>>
>>>
>>> Summary: I would like to use rdf to describe graphs in a dataset, e.g. to say who the author was.
>>>
>>> as a simple example
>>>
>>> my:graph {
>>>     my:graph dc:creator "Jeremy J. Carroll" .
>>> }
>>>
>>> I cannot see how to do this with the current drafts, editors drafts, etc.
>>>
>>> A possible approach would be to reopen issue 35  and have a class rdfs:Graph, s.t. for a <URI> used as the name of a graph in a dataset the triple
>>>     <URI> rdf:type rdfs:Graph
>>> holds.
>>> More weakly, I would be satisfied with such a concept being added to the RDF vocabulary, without the implication above holding, but with a suggested usage pattern.
>>>
>>> Also, I basically finished this message before finding issue 35 and its superficially reasonable resolution that sd:Graph may meet my needs. This suggests that some documentation link from either RDF Concepts or RDF Schema or RDF Semantics to SPARQL Service Description would be helpful ….
>>> However, the Service Description doc
>>> http://www.w3.org/TR/sparql11-service-description/
>>> ducks on the issue of whether the name denotes the graph, and so does not give me a clear place to put such metadata.
>>> I think if the RDF WG tried writing such documentation, they would discover that the resolution of issue 35 would unravel - the trick is to allow such unravelling without having too much of the named graphs work unravel.
>>>
>>> ----
>>>
>>>
>>> Here is my actual use case …..
>>>
>>>
>>>
>>>
>>>
>>> I first give my motivation, then I give my weak suggestion.
>>>
>>> Motivation:
>>> =========
>>>
>>> I referred to RDF Concepts 1.1 today because I am constructing an RDF dataset and wished to add metadata concerning the named graphs.
>>> I am trying to articulate a multi tenant architecture over a SPARQL end point, in which each user is assigned to a specific organization, and then depending on this organization, they have access to different named graphs.
>>>
>>> I wish to refer to the named graphs using the URI names I have assigned to them, and I wish to create my own property to add this metadata
>>>
>>>
>>> Concretely, my property might be
>>>         syapse:owningOrganization
>>>
>>> and the quads I was thinking of producing include
>>>
>>> GRAPH <https://test.syapse.com/graph/syapse> {
>>>      <https://test.syapse.com/graph/syapse> syapse:owningOrganization syapse: .
>>>       syapse:owningOrganization rdf:type owl:FunctionalProperty .
>>>       syapse:owningOrganization rdfs:range syapse:Organization .
>>>       syapse:   rdf:type syapse:Organization .
>>>       syapse:Organization rdf:type rdfs:Class .
>>>      …
>>>      …
>>> }
>>>
>>> GRAPH <https://test.syapse.com/graph/ontology/base> {
>>>      <https://test.syapse.com/graph/ontology/base> syapse:owningOrganization syapse: .
>>>      …
>>>      …
>>> }
>>>
>>> GRAPH <https://test.syapse.com/graph/ontology/sys> {
>>>      <https://test.syapse.com/graph/ontology/sys> syapse:owningOrganization syapse: .
>>>      …
>>>      …
>>> }
>>>
>>> GRAPH <https://test.syapse.com/graph/ontology/c2> {
>>>      <https://test.syapse.com/graph/ontology/c2> syapse:owningOrganization <https://test.syapse.com/graph/southParkUniversity> .
>>>      …
>>>      …
>>> }
>>>
>>> GRAPH <https://test.syapse.com/graph/southParkUniversity/abox> {
>>>      <https://test.syapse.com/graph/southParkUniversity/abox> syapse:owningOrganization <https://test.syapse.com/graph/southParkUniversity> .
>>>      <https://test.syapse.com/graph/southParkUniversity> rdf:type syapse:Organization .
>>>      …
>>>      …
>>> }
>>>
>>>
>>> This allows me to run a privileged SPARQL query across the whole dataset to find out which graphs are assigned to which organization, and then knowing which organization a user is in, I can have application logic to determine which named graphs they can access, and restrict their queries to those named graphs.
>>>
>>>
>>> Weak suggestion
>>> ==============
>>>
>>> I read the very limited text in the dataset section, and the note as reflecting a victory for those who do not want the implication that the name of the graph is a graph to hold.
>>> As a long standing advocate of the other position in which, of course, it denotes … I am somewhat disappointed.
>>>
>>> However, adding such a vocab item can allow the users to decide on a case-by-case basis whether such denotation is intended or not.
>>>
>>> e.g.
>>>
>>>     rdfs:Graph
>>>       rdfs:Graph is the class of RDF Graphs as defined by RDF Concepts.
>>>       
>>>    Semantics:
>>>
>>>     <g> { …. }
>>>
>>>     does not imply
>>>           g rdf:type rdfs:Graph
>>>
>>>
>>> but
>>>
>>>      <g> { …. } .
>>>      <g>  rdf:type rdfs:Graph
>>>
>>> does imply that the interpretation of <g> is given by the graph.
>>>
>>>
>>> Problems with the Service Description approach
>>> =====================================
>>>
>>> Reading
>>> http://www.w3.org/TR/sparql11-service-description/
>>> my understanding is that the intent is for the endpoint to provide (closed) metadata about the dataset, which does not enable further comment even from someone with update privileges on the dataset.
>>>
>>> e.g. in
>>>
>>>
>>>
>>> @prefix sd: <http://www.w3.org/ns/sparql-service-description#> .
>>> @prefix ent: <http://www.w3.org/ns/entailment/> .
>>> @prefix prof: <http://www.w3.org/ns/owl-profile/> .
>>> @prefix void: <http://rdfs.org/ns/void#> .
>>>
>>> [] a sd:Service ;
>>>      sd:endpoint <http://www.example/sparql/> ;
>>>      sd:supportedLanguage sd:SPARQL11Query ;
>>>      sd:resultFormat <http://www.w3.org/ns/formats/RDF_XML>, <http://www.w3.org/ns/formats/Turtle> ;
>>>      sd:extensionFunction <http://example.org/Distance> ;
>>>      sd:feature sd:DereferencesURIs ;
>>>      sd:defaultEntailmentRegime ent:RDFS ;
>>>      sd:defaultDataset [
>>>          a sd:Dataset ;
>>>          sd:defaultGraph [
>>>              a sd:Graph ;
>>>              void:triples 100
>>>          ] ;
>>>          sd:namedGraph [
>>>              a sd:NamedGraph ;
>>>              sd:name <http://www.example/named-graph> ;
>>>              sd:entailmentRegime ent:OWL-RDF-Based ;
>>>              sd:supportedEntailmentProfile prof:RL ;
>>>              sd:graph [
>>>                  a sd:Graph ;
>>>                  void:triples 2000
>>>              ]
>>>          ]
>>>      ] .
>>>
>>> <http://example.org/Distance> a sd:Function .
>>>
>>>
>>> The description of the named graph is attached to an explicitly blank node, which I then cannot comment on further in my own graph or indeed inside the graph named <http://www.example/named-graph> itself.
>>> Thus I cannot add a dc:creator (or syapse:owningOrganization) triple inside this service description, because SPARQL 1.1 does not give me (nor does it intend to give me) write access to the service description, even if I have write access to <http://www.example/named-graph>
>>>
>>> These issues perhaps could be addressed by making sd:graph and sd:name  both 1-1 properties …. but I imagine there may be some reluctance ….
>>>
>>> NB - this last comment is not a formal comment on the Service Description Spec, which seems fit-for-purpose; it is a comment on the current resolution of Issue-35, which neglects that the purpose of SPARQL Service Description is less than is needed to address the issue
>>>
>>>
>>>
>>>
>>>
>>>
>>> Jeremy J Carroll
>>> Principal Architect
>>> Syapse, Inc.
>>>
>>>
>>>
>>>
>>>
>
Received on Saturday, 7 September 2013 15:06:11 UTC