- From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Date: Tue, 16 Jul 2013 15:05:23 +0200
- To: public-rdf-wg@w3.org
About the Note on dataset semantics: I am planning to get this done (at least have a more or less complete draft) by the end of the month, hopefully. I thought I could get to it earlier, but it's surprising how many things a teacher-researcher has to do when no teaching has to be done...

AZ

On 16/07/2013 14:35, Sandro Hawke wrote:
> On 07/16/2013 12:21 AM, Pat Hayes wrote:
>> On Jul 15, 2013, at 10:14 AM, Jeremy J Carroll wrote:
>>
>>> Hi Sandro
>>>
>>> to reply, in turn informally
>> And my 2c, also informal.
>
> I wonder if we should loosen the dont-reply-to-comments rule. A few
> like this seem fine, but I'm enjoying not having the out-of-control
> threads. I replied more to Jeremy, here:
> http://lists.w3.org/Archives/Public/www-archive/2013Jul/0030.html
>
>>> I want to say simple stuff like who wrote a graph named in the
>>> dataset. The easiest way to do this is to attach the metadata to the
>>> name. This currently is not supported by RDF and I would like to have
>>> a clear technical explanation as to why, rather than a political
>>> rationale (which is of course totally understandable) "we didn't pick
>>> a semantics for datasets, because there are so many different ones
>>> out there already, so nothing we could pick wouldn't cause someone
>>> problems."
>>
>> The basic problem is that there are two ways to think of the semantics
>> of graph names, each one internally consistent and useful, but
>> mutually incompatible. One is your and Sandro's preferred reading
>> where the graph name IRI denotes the graph, and is so used in metadata
>> expressed in RDF. The other is that the graph name is a third argument
>> to the properties used in the named graph, and a dataset expresses
>> trinary rather than binary predications, or (if you like) binary
>> predications with an extra "parameter" (or you could call it a
>> "context"; these are all formally equivalent.)
>> For example, this would
>> be the natural way to interpret graph names which refer to times or
>> dates, meaning that the assertions in the graph are indexed to that
>> time. Applications using graph "names" as parameters in this way
>> already exist. The key point about them is that the extra
>> argument/parameter/context does not *refer to* the graph.
>
> Very interesting. Yes, I think I understand that a little better
> now. It's much more sophisticated than my example in the
> above-referenced email about gbox-vs-gsnap reference.
>
>> Both patterns of use are "natural", and both can be readily formalized
>> in a semantics, but they are different. Imposing either one as
>> normative would create problems for applications using (perhaps
>> implicitly) the other. We were not able to invent a way to allow both
>> without tweaking the basic RDF semantics, which would go beyond what
>> was permitted by our (narrowly written) charter.
>>
>> I know this does not solve your problem, but I hope it helps explain
>> why we decided to punt on this issue.
>
> In his reply to my reply, Jeremy also said, quite reasonably:
>
>     I would like to have a clear technical explanation as to why, rather
>     than a political rationale (which is of course totally understandable)
>     "we didn't pick a semantics for datasets, because there are so many
>     different ones out there already, so nothing we could pick wouldn't
>     cause someone problems."
>
>
> I guess that would go in the Dataset Semantics NOTE. [Except that
> we've talked about two of these (one by me, focusing on programming, and
> one by Antoine, focusing on formal semantics), and I'm not sure either
> will get written. :-( ]
>
> -- Sandro
>
>> Pat
>>
>>> Jeremy J Carroll
>>> Principal Architect
>>> Syapse, Inc.
>>>
>>>
>>> On Jul 11, 2013, at 5:59 PM, Sandro Hawke <sandro@w3.org> wrote:
>>>
>>>> On 07/11/2013 03:06 PM, Jeremy J Carroll wrote:
>>>>> Hello
>>>>>
>>>>> This is a formal comment on
>>>>> http://www.w3.org/TR/rdf11-concepts/#section-dataset, and it
>>>>> appears a comment on
>>>>> https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-schema/index.html
>>>>> and quite possibly on the RDF Semantics ….
>>>> This is a brief, informal reply to both the message I'm replying to
>>>> [1] and your following message [2].
>>>>
>>>> The short answer is: we didn't pick a semantics for datasets,
>>>> because there are so many different ones out there already, so
>>>> nothing we could pick wouldn't cause someone problems. So we say
>>>> that datasets, on their own, have a minimal semantics plus
>>>> application-specific semantic extensions. If you want
>>>> interoperability between applications, you need to indicate your
>>>> semantic extensions. You can do that out-of-band (in some way you
>>>> figure out) or in band, by putting some metadata in the dataset
>>>> saying which semantic extensions you're using.
>>>>
>>>> We are hoping to produce a NOTE which provides some options, so
>>>> people don't have to start from scratch with these indicators. We
>>>> don't think the subject is mature enough yet to put designs in a
>>>> Recommendation, though.
>>>>
>>>> My current thinking, which the group hasn't really talked about, is:
>>>>
>>>>     <> a rdf:WebviewDataset (Or ResourceStateDataset or
>>>>     GraphStoreSnapshot)
>>>>
>>>> would provide the semantics I think you want, where a URL graph name
>>>> is associated with the graph you'd get if you dereferenced that
>>>> URL. You might think of the URLs as denoting the Web Resource
>>>> whose state is represented by the associated graph. My sense from
>>>> your examples is that's how you're thinking about datasets.
>>>>
>>>>     <> a rdf:DirectDataset
>>>>
>>>> would provide the semantics some other folks want, where the graph
>>>> names actually denote the associated graphs (the pure mathematical
>>>> set of triples, not a thing which can change over time). This is
>>>> what people are used to from N3 and (I think) from most provenance
>>>> work.
>>>>
>>>> I'm inclined to say DirectDataset only constrains name/graph pairs
>>>> where the graph names are blank nodes and WebviewDataset only
>>>> constrains name/graph pairs where the graph names are http(s)
>>>> IRIs. This would allow these two semantic extensions to be used
>>>> together. If you said:
>>>>
>>>>     <> a rdf:WebviewDataset, rdf:DirectDataset.
>>>>     GRAPH _:a { <s> <p> 1 }
>>>>     GRAPH <b> { <s> <p> 2}
>>>>     _:a eg:endorsedBy eg:sandro.
>>>>     <b> eg:endorsedBy eg:sandro.
>>>>
>>>> Then you'd be saying I endorsed the statement {<s> <p> 1 } and I
>>>> endorsed the (mutable) Web Resource <b>, whose contents happen to be
>>>> { <s> <p> 2 }. (On that latter bit, hopefully there will be some
>>>> other metadata to help clarify *when* those are the contents of <b>,
>>>> but we haven't figured out yet how to do that.)
>>>>
>>>> Does that make any sense? Does this change your comments? I
>>>> have to apologize for not having the NOTE drafted yet, and thus
>>>> adding to the confusion.
>>>>
>>>> -- Sandro
>>>>
>>>>
>>>> [1]
>>>> http://lists.w3.org/Archives/Public/public-rdf-comments/2013Jul/0021.html
>>>>
>>>> [2]
>>>> http://lists.w3.org/Archives/Public/public-rdf-comments/2013Jul/0022.html
>>>>
>>>>> It seems to be a suggestion to reopen issue 35
>>>>> http://www.w3.org/2011/rdf-wg/track/issues/35
>>>>> which points to
>>>>> http://www.w3.org/TR/sparql11-service-description/
>>>>> hence I am CC-ing dawg.
>>>>> The last part of this message discusses problems in using service
>>>>> description to meet my use case: to me, this is not a comment on
>>>>> DAWG's work, but a comment that RDF Core cannot use DAWG's work of
>>>>> more limited scope to duck the issue.
>>>>>
>>>>>
>>>>> Summary: I would like to use RDF to describe graphs in a dataset,
>>>>> e.g. to say who the author was.
>>>>>
>>>>> as a simple example
>>>>>
>>>>>     my:graph {
>>>>>         my:graph dc:creator "Jeremy J. Carroll" .
>>>>>     }
>>>>>
>>>>> I cannot see how to do this with the current drafts, editors'
>>>>> drafts, etc.
>>>>>
>>>>> A possible approach would be to reopen issue 35 and have a class
>>>>> rdfs:Graph, s.t. for a <URI> used as the name of a graph in a
>>>>> dataset the triple
>>>>>     <URI> rdf:type rdfs:Graph
>>>>> holds.
>>>>> More weakly, I would be satisfied with such a concept being added
>>>>> to the RDF vocabulary, without the implication above holding, but a
>>>>> suggested usage pattern.
>>>>>
>>>>> Also, I basically finished this message before finding issue 35 and
>>>>> its superficially reasonable resolution that sd:Graph may meet my
>>>>> needs. This suggests that some documentation link from either RDF
>>>>> Concepts or RDF Schema or RDF Semantics to SPARQL Service
>>>>> Description would be helpful ….
>>>>> However, the Service Description doc
>>>>> http://www.w3.org/TR/sparql11-service-description/
>>>>> ducks on the issue of whether the name denotes the graph, and so
>>>>> does not give me a clear place to put such metadata.
>>>>> I think if the RDF WG tried writing such documentation, they would
>>>>> discover that the resolution of issue 35 would unravel - the trick
>>>>> is to allow such unravelling without having too much of the named
>>>>> graphs work unravel.
>>>>>
>>>>> ----
>>>>>
>>>>>
>>>>> Here is my actual use case …..
>>>>>
>>>>>
>>>>> I first give my motivation, then I give my weak suggestion.
>>>>>
>>>>> Motivation:
>>>>> =========
>>>>>
>>>>> I referred to RDF Concepts 1.1 today because I am constructing an
>>>>> RDF dataset and wished to add metadata concerning the named graphs.
>>>>> I am trying to articulate a multi tenant architecture over a SPARQL
>>>>> end point, in which each user is assigned to a specific
>>>>> organization, and then depending on this organization, they have
>>>>> access to different named graphs.
>>>>>
>>>>> I wish to refer to the named graphs using the URI names I have
>>>>> assigned to them, and I wish to create my own property to add this
>>>>> metadata
>>>>>
>>>>>
>>>>> Concretely, my property might be
>>>>>     syapse:owningOrganization
>>>>>
>>>>> and the quads I was thinking of producing include
>>>>>
>>>>> GRAPH <https://test.syapse.com/graph/syapse> {
>>>>>     <https://test.syapse.com/graph/syapse>
>>>>>         syapse:owningOrganization syapse: .
>>>>>     syapse:owningOrganization rdf:type owl:FunctionalProperty .
>>>>>     syapse:owningOrganization rdfs:range syapse:Organization .
>>>>>     syapse: rdf:type syapse:Organization .
>>>>>     syapse:Organization rdf:type rdfs:Class .
>>>>>     …
>>>>>     …
>>>>> }
>>>>>
>>>>> GRAPH <https://test.syapse.com/graph/ontology/base> {
>>>>>     <https://test.syapse.com/graph/ontology/base>
>>>>>         syapse:owningOrganization syapse: .
>>>>>     …
>>>>>     …
>>>>> }
>>>>>
>>>>> GRAPH <https://test.syapse.com/graph/ontology/sys> {
>>>>>     <https://test.syapse.com/graph/ontology/sys>
>>>>>         syapse:owningOrganization syapse: .
>>>>>     …
>>>>>     …
>>>>> }
>>>>>
>>>>> GRAPH <https://test.syapse.com/graph/ontology/c2> {
>>>>>     <https://test.syapse.com/graph/ontology/c2>
>>>>>         syapse:owningOrganization
>>>>>             <https://test.syapse.com/graph/southParkUniversity> .
>>>>>     …
>>>>>     …
>>>>> }
>>>>>
>>>>> GRAPH <https://test.syapse.com/graph/southParkUniversity/abox> {
>>>>>     <https://test.syapse.com/graph/southParkUniversity/abox>
>>>>>         syapse:owningOrganization
>>>>>             <https://test.syapse.com/graph/southParkUniversity> .
>>>>>     <https://test.syapse.com/graph/southParkUniversity> rdf:type
>>>>>         syapse:Organization .
>>>>>     …
>>>>>     …
>>>>> }
>>>>>
>>>>>
>>>>> This allows me to run a privileged SPARQL query across the whole
>>>>> dataset to find out which graphs are assigned to which
>>>>> organization, and then knowing which organization a user is in, I
>>>>> can have application logic to determine which named graphs they can
>>>>> access, and restrict their queries to those named graphs.
>>>>>
>>>>>
>>>>> Weak suggestion
>>>>> ==============
>>>>>
>>>>> I read the very limited text in the dataset section, and the note
>>>>> as reflecting a victory for those who do not want the implication
>>>>> that the name of the graph is a graph to hold.
>>>>> As a long standing advocate of the other position in which, of
>>>>> course, it denotes … I am somewhat disappointed.
>>>>>
>>>>> However, adding such a vocab item can allow the users to decide on
>>>>> a case-by-case basis whether such denotation is intended or not.
>>>>>
>>>>> e.g.
>>>>>
>>>>> rdfs:Graph
>>>>>     rdfs:Graph is the class of RDF Graphs as defined by RDF Concepts.
>>>>>
>>>>> Semantics:
>>>>>
>>>>>     <g> { …. }
>>>>>
>>>>> does not imply
>>>>>     g rdf:type rdfs:Graph
>>>>>
>>>>>
>>>>> but
>>>>>
>>>>>     <g> { …. } .
>>>>>     <g> rdf:type rdfs:Graph
>>>>>
>>>>> does imply that the interpretation of <g> is given by the graph.
>>>>>
>>>>>
>>>>> Problems with the Service Description approach
>>>>> =====================================
>>>>>
>>>>> Reading
>>>>> http://www.w3.org/TR/sparql11-service-description/
>>>>> my understanding is that the intent is for the endpoint to provide
>>>>> (closed) metadata about the dataset, which does not enable further
>>>>> comment even from someone with update privileges on the dataset.
>>>>>
>>>>> e.g. in
>>>>>
>>>>>
>>>>> @prefix sd: <http://www.w3.org/ns/sparql-service-description#> .
>>>>> @prefix ent: <http://www.w3.org/ns/entailment/> .
>>>>> @prefix prof: <http://www.w3.org/ns/owl-profile/> .
>>>>> @prefix void: <http://rdfs.org/ns/void#> .
>>>>>
>>>>> [] a sd:Service ;
>>>>>     sd:endpoint <http://www.example/sparql/> ;
>>>>>     sd:supportedLanguage sd:SPARQL11Query ;
>>>>>     sd:resultFormat <http://www.w3.org/ns/formats/RDF_XML>,
>>>>>         <http://www.w3.org/ns/formats/Turtle> ;
>>>>>     sd:extensionFunction <http://example.org/Distance> ;
>>>>>     sd:feature sd:DereferencesURIs ;
>>>>>     sd:defaultEntailmentRegime ent:RDFS ;
>>>>>     sd:defaultDataset [
>>>>>         a sd:Dataset ;
>>>>>         sd:defaultGraph [
>>>>>             a sd:Graph ;
>>>>>             void:triples 100
>>>>>         ] ;
>>>>>         sd:namedGraph [
>>>>>             a sd:NamedGraph ;
>>>>>             sd:name <http://www.example/named-graph> ;
>>>>>             sd:entailmentRegime ent:OWL-RDF-Based ;
>>>>>             sd:supportedEntailmentProfile prof:RL ;
>>>>>             sd:graph [
>>>>>                 a sd:Graph ;
>>>>>                 void:triples 2000
>>>>>             ]
>>>>>         ]
>>>>>     ] .
>>>>>
>>>>> <http://example.org/Distance> a sd:Function .
>>>>>
>>>>>
>>>>> The description of the named graph is attached to an explicitly
>>>>> blank node, which I then cannot comment on further in my own
>>>>> graph or indeed inside the graph named
>>>>> <http://www.example/named-graph> itself.
>>>>> Thus I cannot add a dc:creator (or syapse:owningOrganization)
>>>>> triple inside this service description (because SPARQL 1.1 does not
>>>>> give me, nor does it intend to give me, write access to the service
>>>>> description), even if I have write access to
>>>>> <http://www.example/named-graph>
>>>>>
>>>>> These issues perhaps could be addressed by making sd:graph and
>>>>> sd:name both 1-1 properties …. but I imagine there may be some
>>>>> reluctance ….
>>>>>
>>>>> NB - this last comment is not a formal comment on the Service
>>>>> Description Spec, which seems fit-for-purpose; it is a comment on
>>>>> the current resolution of Issue-35, which neglects that the purpose
>>>>> of SPARQL Service Description is less than is needed to address the
>>>>> issue.
>>>>>
>>>>>
>>>>> Jeremy J Carroll
>>>>> Principal Architect
>>>>> Syapse, Inc.
>>>>>
>>>
>> ------------------------------------------------------------
>> IHMC (850)434 8903 or (650)494 3973
>> 40 South Alcaniz St. (850)202 4416 office
>> Pensacola (850)202 4440 fax
>> FL 32502 (850)291 0667 mobile
>> phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
>>
>

-- 
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tel: +33(0)4 77 42 66 03
Fax: +33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Tuesday, 16 July 2013 13:01:23 UTC
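
The multi-tenant use case in Jeremy's message (find out which graphs belong to which organization, then restrict a user's queries to those graphs) can be sketched in plain Python. This is only an illustration under stated assumptions, not anyone's actual implementation: quads are modelled as 4-tuples instead of being queried over SPARQL, the expansion of the syapse: prefix is invented because the thread never gives it, and the last quad is made-up sample data. Both function names are hypothetical.

```python
from collections import defaultdict

# ASSUMPTION: the thread never expands the syapse: prefix; this IRI is invented.
SYAPSE = "https://example.org/syapse#"
OWNING_ORG = SYAPSE + "owningOrganization"

BASE = "https://test.syapse.com/graph/"
SOUTH_PARK = BASE + "southParkUniversity"

# Quads as (graph name, subject, predicate, object), abridged from the
# GRAPH blocks in the message. Each graph carries one triple about itself.
QUADS = [
    (BASE + "syapse", BASE + "syapse", OWNING_ORG, SYAPSE),
    (BASE + "ontology/base", BASE + "ontology/base", OWNING_ORG, SYAPSE),
    (BASE + "ontology/sys", BASE + "ontology/sys", OWNING_ORG, SYAPSE),
    (BASE + "ontology/c2", BASE + "ontology/c2", OWNING_ORG, SOUTH_PARK),
    (SOUTH_PARK + "/abox", SOUTH_PARK + "/abox", OWNING_ORG, SOUTH_PARK),
    # Hypothetical non-metadata triple, present only to show the filtering.
    (SOUTH_PARK + "/abox", "https://example.org/sample1",
     "https://example.org/status", "ok"),
]


def graphs_by_organization(quads):
    """Privileged pass over the whole dataset: map each organization to its graphs.

    Only quads whose subject equals the enclosing graph's name are counted,
    i.e. the graph name is read as denoting the graph it labels.
    """
    owned = defaultdict(set)
    for graph, subject, predicate, obj in quads:
        if subject == graph and predicate == OWNING_ORG:
            owned[obj].add(graph)
    return dict(owned)


def visible_quads(quads, organization):
    """Restrict a user's view to the graphs owned by their organization."""
    allowed = graphs_by_organization(quads).get(organization, set())
    return [quad for quad in quads if quad[0] in allowed]
```

The `subject == graph` test is where the sketch commits to the "graph name denotes the graph" reading. Under the other reading Pat describes, where the name is an extra argument or context, that comparison would carry no such meaning, which is the interoperability problem the thread is about.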