Re: rdfs:Graph ? comment on http://www.w3.org/TR/rdf11-concepts/#section-dataset and issue 35 from Pat Hayes on 2013-07-16 (public-rdf-comments@w3.org from July 2013)

From: Pat Hayes <phayes@ihmc.us>
Date: Mon, 15 Jul 2013 23:21:45 -0500
To: Jeremy J Carroll <jjc@syapse.com>
Cc: Sandro Hawke <sandro@w3.org>, "public-rdf-comments@w3.org Comments" <public-rdf-comments@w3.org>
Message-Id: <EE3082AF-5120-4779-B8A0-442EC7E71A14@ihmc.us>
On Jul 15, 2013, at 10:14 AM, Jeremy J Carroll wrote:

> 
> Hi Sandro
> 
> to reply, in turn informally

And my 2c, also informal. 

> I want to say simple stuff like who wrote a graph named in the dataset. The easiest way to do this is to attach the metadata to the name. This currently is not supported by RDF and I would like to have a clear technical explanation as to why, rather than a political rationale (which is of course totally understandable) "we didn't pick a semantics for datasets, because there are so many different ones out there already, so nothing we could pick wouldn't cause someone problems."


The basic problem is that there are two ways to think of the semantics of graph names, each one internally consistent and useful, but mutually incompatible. One is your and Sandro's preferred reading where the graph name IRI denotes the graph, and is so used in metadata expressed in RDF. The other is that the graph name is a third argument to the properties used in the named graph, and a dataset expresses trinary rather than binary predications, or (if you like) binary predications with an extra "parameter" (or you could call it a "context"; these are all formally equivalent.) For example, this would be the natural way to interpret graph names which refer to times or dates, meaning that the assertions in the graph are indexed to that time. Applications using graph "names" as parameters in this way already exist. The key point about them is that the extra argument/parameter/context does not *refer to* the graph. 
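
As a rough illustration of the difference (a minimal Python sketch, not anything taken from the specs; the names ex:g1, ex:alice, ex:worksFor, ex:acme, the date mapping, and both function names are made up for the example), the two readings can be contrasted like this:

```python
from typing import Dict, FrozenSet, Set, Tuple

Triple = Tuple[str, str, str]            # (subject, predicate, object)
Dataset = Dict[str, FrozenSet[Triple]]   # graph name -> graph

# Hypothetical toy dataset: one named graph containing one triple.
dataset: Dataset = {
    "ex:g1": frozenset({("ex:alice", "ex:worksFor", "ex:acme")}),
}

def denoted_graph(ds: Dataset, name: str) -> FrozenSet[Triple]:
    """Reading 1: the graph name denotes the graph (a set of triples)."""
    return ds[name]

def ternary_facts(ds: Dataset) -> Set[Tuple[str, str, str, str]]:
    """Reading 2: the graph name is a third argument to each property."""
    return {(p, s, o, name) for name, graph in ds.items() for (s, p, o) in graph}

# Reading 1: metadata whose subject is "ex:g1" is about this set of triples.
about_r1 = denoted_graph(dataset, "ex:g1")

# Reading 2: "ex:g1" might denote a date (an assumed, application-level mapping),
# so the same metadata would be about the date, not about the graph.
name_denotes = {"ex:g1": "2013-07-15"}
about_r2 = name_denotes["ex:g1"]
facts = ternary_facts(dataset)
```

Under the first reading a dc:creator triple on ex:g1 describes the set of triples; under the second it would describe whatever the parameter refers to (here a date), and the graph itself has no name to hang metadata on.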

Both patterns of use are "natural", and both can be readily formalized in a semantics, but they are different. Imposing either one as normative would create problems for applications using (perhaps implicitly) the other. We were not able to invent a way to allow both without tweaking the basic RDF semantics, which would go beyond what was permitted by our (narrowly written) charter. 

I know this does not solve your problem, but I hope it helps explain why we decided to punt on this issue. 

Pat

> 
> Jeremy J Carroll
> Principal Architect
> Syapse, Inc.
> 
> 
> 
> On Jul 11, 2013, at 5:59 PM, Sandro Hawke <sandro@w3.org> wrote:
> 
>> On 07/11/2013 03:06 PM, Jeremy J Carroll wrote:
>>> 
>>> Hello
>>> 
>>> This is a formal comment on http://www.w3.org/TR/rdf11-concepts/#section-dataset, and it appears a comment on
>>> https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-schema/index.html
>>> and quite possibly on the RDF Semantics ….
>> 
>> This is a brief, informal reply to both the message I'm replying to [1] and your following message [2].
>> 
>> The short answer is: we didn't pick a semantics for datasets, because there are so many different ones out there already, so nothing we could pick wouldn't cause someone problems. So we say that datasets, on their own, have a minimal semantics plus application-specific semantic extensions. If you want interoperability between applications, you need to indicate your semantic extensions. You can do that out-of-band (in some way you figure out) or in-band, by putting some metadata in the dataset saying which semantic extensions you're using.
>> 
>> We are hoping to produce a NOTE which provides some options, so people don't have to start from scratch with these indicators.   We don't think the subject is mature enough yet to put designs in a Recommendation, though.
>> 
>> My current thinking, which the group hasn't really talked about, is:
>> 
>>  <> a rdf:WebviewDataset   (Or ResourceStateDataset or GraphStoreSnapshot)
>> 
>> would provide the semantics I think you want, where a URL graph name is associated with the graph you'd get if you dereferenced that URL.   You might think of the URLs as denoting the Web Resource whose state is represented by the associated graph.   My sense from your examples is that's how you're thinking about datasets.
>> 
>> <> a rdf:DirectDataset
>> 
>> would provide the semantics some other folks want, where the graph names actually denote the associated graphs (the pure mathematical set of triples, not a thing which can change over time).    This is what people are used to from N3 and (I think) from most provenance work.
>> 
>> I'm inclined to say DirectDataset only constrains name/graph pairs where the graph names are blank nodes and WebviewDataset only constrains name/graph pairs where the graph names are http(s) IRIs.   This would allow these two semantic extensions to be used together.   If you said:
>> 
>> <> a rdf:WebviewDataset, rdf:DirectDataset.
>> GRAPH _:a { <s> <p> 1 }
>> GRAPH <b> { <s> <p> 2}
>> _:a eg:endorsedBy eg:sandro.
>> <b> eg:endorsedBy eg:sandro.
>> 
>> Then you'd be saying I endorsed the statement {<s> <p> 1 } and I endorsed the (mutable) Web Resource <b>, whose contents happen to be { <s> <p> 2 }.     (On that latter bit, hopefully there will be some other metadata to help clarify *when* those are the contents of <b>, but we haven't figured out yet how to do that.)
>> 
>> Does that make any sense?   Does this change your comments?     I have to apologize for not having the NOTE drafted yet, and thus adding to the confusion.
>> 
>>     -- Sandro
>> 
>> 
>> [1] http://lists.w3.org/Archives/Public/public-rdf-comments/2013Jul/0021.html
>> [2] http://lists.w3.org/Archives/Public/public-rdf-comments/2013Jul/0022.html
>>> It seems to be a suggestion to reopen issue 35
>>> http://www.w3.org/2011/rdf-wg/track/issues/35
>>> which points to
>>> http://www.w3.org/TR/sparql11-service-description/
>>> hence I am CC-ing dawg.
>>> The last part of this message discusses problems in using service description to meet my use case: to me, this is not a comment on DAWG's work, but a comment that RDF Core cannot use DAWG's work of more limited scope to duck the issue.
>>> 
>>> 
>>> Summary: I would like to use rdf to describe graphs in a dataset, e.g. to say who the author was.
>>> 
>>> as a simple example
>>> 
>>> my:graph {
>>>   my:graph dc:creator "Jeremy J. Carroll" .
>>> }
>>> 
>>> I cannot see how to do this with the current drafts, editors drafts, etc.
>>> 
>>> A possible approach would be to reopen issue 35  and have a class rdfs:Graph, s.t. for a <URI> used as the name of a graph in a dataset the triple
>>>   <URI> rdf:type rdfs:Graph
>>> holds.
>>> More weakly, I would be satisfied with such a concept being added to the RDF vocabulary, without the implication above holding, but a suggested usage pattern.
>>> 
>>> Also, I basically finished this message before finding issue 35 and its superficially reasonable resolution that sd:Graph may meet my needs. This suggests that some documentation link from either RDF Concepts or RDF Schema or RDF Semantics to SPARQL Service Description would be helpful ….
>>> However, the Service Description doc
>>> http://www.w3.org/TR/sparql11-service-description/
>>> ducks on the issue of whether the name denotes the graph, and so does not give me a clear place to put such metadata.
>>> I think if the RDF WG tried writing such documentation, they would discover that the resolution of issue 35 would unravel - the trick is to allow such unravelling without having too much of the named graphs work unravel.
>>> 
>>> ----
>>> 
>>> 
>>> Here is my actual use case …..
>>> 
>>> 
>>> 
>>> 
>>> 
>>> I first give my motivation, then I give my weak suggestion.
>>> 
>>> Motivation:
>>> =========
>>> 
>>> I referred to RDF Concepts 1.1 today because I am constructing an RDF dataset and wished to add metadata concerning the named graphs.
>>> I am trying to articulate a multi-tenant architecture over a SPARQL endpoint, in which each user is assigned to a specific organization and, depending on this organization, has access to different named graphs.
>>> 
>>> I wish to refer to the named graphs using the URI names I have assigned to them, and I wish to create my own property to add this metadata.
>>> 
>>> 
>>> Concretely, my property might be
>>>       syapse:owningOrganization
>>> 
>>> and the quads I was thinking of producing include
>>> 
>>> GRAPH <https://test.syapse.com/graph/syapse> {
>>>    <https://test.syapse.com/graph/syapse> syapse:owningOrganization syapse: .
>>>     syapse:owningOrganization rdf:type owl:FunctionalProperty .
>>>     syapse:owningOrganization rdfs:range syapse:Organization .
>>>     syapse:   rdf:type syapse:Organization .
>>>     syapse:Organization rdf:type rdfs:Class .
>>>    …
>>>    …
>>> }
>>> 
>>> GRAPH <https://test.syapse.com/graph/ontology/base> {
>>>    <https://test.syapse.com/graph/ontology/base> syapse:owningOrganization syapse: .
>>>    …
>>>    …
>>> }
>>> 
>>> GRAPH <https://test.syapse.com/graph/ontology/sys> {
>>>    <https://test.syapse.com/graph/ontology/sys> syapse:owningOrganization syapse: .
>>>    …
>>>    …
>>> }
>>> 
>>> GRAPH <https://test.syapse.com/graph/ontology/c2> {
>>>    <https://test.syapse.com/graph/ontology/c2> syapse:owningOrganization <https://test.syapse.com/graph/southParkUniversity> .
>>>    …
>>>    …
>>> }
>>> 
>>> GRAPH <https://test.syapse.com/graph/southParkUniversity/abox> {
>>>    <https://test.syapse.com/graph/southParkUniversity/abox> syapse:owningOrganization <https://test.syapse.com/graph/southParkUniversity> .
>>>    <https://test.syapse.com/graph/southParkUniversity> rdf:type syapse:Organization .
>>>    …
>>>    …
>>> }
>>> 
>>> 
>>> This allows me to run a privileged SPARQL query across the whole dataset to find out which graphs are assigned to which organization, and then knowing which organization a user is in, I can have application logic to determine which named graphs they can access, and restrict their queries to those named graphs.
>>> 
>>> 
>>> Weak suggestion
>>> ==============
>>> 
>>> I read the very limited text in the dataset section, and the note, as reflecting a victory for those who do not want the implication that the name of the graph is a graph to hold.
>>> As a long-standing advocate of the other position in which, of course, it denotes … I am somewhat disappointed.
>>> 
>>> However, adding such a vocab item can allow the users to decide on a case-by-case basis whether such denotation is intended or not.
>>> 
>>> e.g.
>>> 
>>>   rdfs:Graph
>>>     rdfs:Graph is the class of RDF Graphs as defined by RDF Concepts.
>>> 
>>>  Semantics:
>>> 
>>>   <g> { …. }
>>> 
>>>   does not imply
>>>         <g> rdf:type rdfs:Graph
>>> 
>>> 
>>> but
>>> 
>>>    <g> { …. } .
>>>    <g>  rdf:type rdfs:Graph
>>> 
>>> does imply that the interpretation of <g> is given by the graph.
>>> 
>>> 
>>> Problems with the Service Description approach
>>> =====================================
>>> 
>>> Reading
>>> http://www.w3.org/TR/sparql11-service-description/
>>> my understanding is that the intent is for the endpoint to provide (closed) metadata about the dataset, which does not enable further comment even from someone with update privileges on the dataset.
>>> 
>>> e.g. in
>>> 
>>> 
>>> 
>>> @prefix sd: <http://www.w3.org/ns/sparql-service-description#> .
>>> @prefix ent: <http://www.w3.org/ns/entailment/> .
>>> @prefix prof: <http://www.w3.org/ns/owl-profile/> .
>>> @prefix void: <http://rdfs.org/ns/void#> .
>>> 
>>> [] a sd:Service ;
>>>    sd:endpoint <http://www.example/sparql/> ;
>>>    sd:supportedLanguage sd:SPARQL11Query ;
>>>    sd:resultFormat <http://www.w3.org/ns/formats/RDF_XML>, <http://www.w3.org/ns/formats/Turtle> ;
>>>    sd:extensionFunction <http://example.org/Distance> ;
>>>    sd:feature sd:DereferencesURIs ;
>>>    sd:defaultEntailmentRegime ent:RDFS ;
>>>    sd:defaultDataset [
>>>        a sd:Dataset ;
>>>        sd:defaultGraph [
>>>            a sd:Graph ;
>>>            void:triples 100
>>>        ] ;
>>>        sd:namedGraph [
>>>            a sd:NamedGraph ;
>>>            sd:name <http://www.example/named-graph> ;
>>>            sd:entailmentRegime ent:OWL-RDF-Based ;
>>>            sd:supportedEntailmentProfile prof:RL ;
>>>            sd:graph [
>>>                a sd:Graph ;
>>>                void:triples 2000
>>>            ]
>>>        ]
>>>    ] .
>>> 
>>> <http://example.org/Distance> a sd:Function .
>>> 
>>> 
>>> The description of the named graph is attached to an explicitly blank node, which I then cannot comment on further in my own graph, or indeed inside the graph named <http://www.example/named-graph> itself.
>>> Thus I cannot add a dc:creator (or syapse:owningOrganization) triple inside this service description (because SPARQL 1.1 does not give me, nor does it intend to give me, write access to the service description), even if I have write access to <http://www.example/named-graph>.
>>> 
>>> These issues perhaps could be addressed by making sd:graph and sd:name  both 1-1 properties …. but I imagine there may be some reluctance ….
>>> 
>>> NB - this last comment is not a formal comment on the Service Description spec, which seems fit for purpose; it is a comment on the current resolution of Issue-35, which neglects that the purpose of SPARQL Service Description is narrower than what is needed to address the issue.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Jeremy J Carroll
>>> Principal Architect
>>> Syapse, Inc.
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
> 
> 
> 

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 16 July 2013 04:22:12 UTC