- From: Sandro Hawke <sandro@w3.org>
- Date: Wed, 02 May 2012 09:55:57 -0400
- To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Cc: Pat Hayes <phayes@ihmc.us>, RDF WG <public-rdf-wg@w3.org>
On Wed, 2012-05-02 at 15:29 +0200, Antoine Zimmermann wrote:
> PS: Ok, by writing this email thoughts came to me and I believe I better
> see each party's opinions and goals. Sorry if this re-asserts some
> things that were made explicit in earlier discussions.
> There are parts mostly directed to Pat, but the end is certainly more
> interesting to others, especially I think Sandro.

A partial reply - I don't think I'll have time to do more before the
meeting.

>
> On 30/04/2012 19:53, Pat Hayes wrote:
> >
> >
> >>
> >>> Seems to me that this analogy strongly supports Sandro's notion
> >>> of graph names as being, well, names of graphs.
> >>>
> >>> But we can take your view, as I understand it. It is simply a
> >>> rejection of the very idea of datasets having any normative
> >>> semantics or meaning. They are just handy datastructures for
> >>> doing various things with pieces of RDF. Which is fine, and saves
> >>> us a lot of WG effort, but hasn't really advanced the state of the
> >>> art very far, and may not really be living up to our charter.
> >>
> >> My view has always been that we define a normative semantics for
> >> RDF Datasets, and I proposed one more than a year ago. It's fairly
> >> simple: you just apply the RDF semantics to each graph separately
> >> and what you get is an entailed dataset. It's nothing special or
> >> strange
> >
> > Well, it is very strange, by some lights. It is wildly out of line
> > with the intuitions and assumptions underlying the 2004
> > specifications (what I called the 'globalist' perspective on IRI
> > meanings.) And it raises an immediate puzzle, which is WHY an RDF
> > graph should suddenly be allowed to change its meaning when it is
> > embedded inside a dataset and given a name. That seemed extremely
> > puzzling to me, I have to say.
>
> I don't see where the change of meaning happens. If I have the following
> RDF graph:
>
> :c rdfs:subClassOf :d .
> :x rdf:type :c .
>
> it entails:
>
> :x rdf:type :d .
>
> If I put this graph in a dataset:
>
> :d {
>   :c rdfs:subClassOf :d .
>   :x rdf:type :c .
> }
>
> it entails:
>
> :d {
>   :x rdf:type :d .
> }

This statement already shows how differently we are thinking of this.

I don't think putting a graph into a dataset in any way affects the
graph or changes its properties. If G1 entails G2, it doesn't matter
what else we know or say about G1 or G2 -- G1 always entails G2.

When you write down a dataset, as you did twice there in TriG, you are
making a statement. When you said:

  :d {
    :c rdfs:subClassOf :d .
    :x rdf:type :c .
  }

you were saying, in my proposed reading: ":d is something which contains
the triples {:c rdfs:subClassOf :d. :x rdf:type :c.}".

When you said:

  :d {
    :x rdf:type :d .
  }

you were saying, in my proposed reading: ":d is something which contains
the triple {:x rdf:type :d}".

So, yes, of course the first set of triples entails the second set of
triples, but the statement ":d is something which contains {first bunch
of triples}" does not entail the statement ":d is something which
contains {second bunch of triples}".

   -- Sandro

> And all other entailments are preserved. They are simply put "in
> context", so to speak.
>
> >> or hard to get accepted: it's already implemented in some triple
> >> stores. Yes, it may be little in advancing the state of the art,
> >> but it gives a good ground to define notions such as imports,
> >> temporal reasoning, trust-based reasoning and various other
> >> things. It's perfectly in line with what we have to do according to
> >> our charter.
> >>
> >
> > I agree it is quite precise and quite simple. However, it
> > conspicuously fails to do what seems to me to be part of our charter
> > here, which is to make the notion of named graph precise and give a
> > semantics for it.
>
> Tell me what is imprecise and I'll fix it. I claim that it is
> sufficiently precise to be implemented and tested against test cases,
> and I even think that it is already implemented in some triple stores.
> What is missing in my proposal, IMO, is to clearly define the semantic
> extensions that would allow one to constrain the graph "names" to denote
> the graph, that would allow one to "import/inherit" another "named"
> graph, and possibly other extensions.
>
> > I know it takes what SPARQL calls a "named graph"
> > and gives a semantics for that, but it does so by refusing to treat
> > the "name" as a name of the "graph". Again, even that is only a
> > terminological matter, which we could treat as being unfortunate but
> > not fatal; but if people also wish to use those graph "names" to
> > refer to the actual graphs, as some people apparently do want to do,
> > and I suspect many people outside the WG will assume that they can
> > freely do, simply from the fact that they are called "name", then
> > this lack of real naming becomes a genuine semantic problem. Which
> > is why I like Sandro's suggested interpretation of datasets, which
> > provides for the naming relationship, and suggested introducing your
> > contextual-variation-of-meaning idea by a different mechanism built
> > into RDF. If you or someone else can come up with an alternative way
> > to attach names to graphs, I'd be delighted. So far, nobody has,
> > AFAIK.
>
> If I understood well Sandro's suggested interpretation, he would prefer
> that the following TriG file:
>
> :d {
>   :c rdfs:subClassOf :d .
>   :x rdf:type :c .
> }
>
> does *not* entail:
>
> :d {
>   :x rdf:type :d .
> }
>
> So, a graph in a "named" graph pair does not have the semantics of an
> RDF graph outside it. If such is indeed what Sandro suggests, then I can
> use your own argument against it: WHY an RDF graph should suddenly be
> allowed to change its meaning when it is embedded inside a dataset and
> given a name. *That* seemed extremely puzzling to me.
>
> Now, concerning graph "names" denoting the graph itself, I'd propose the
> following:
>
> Call the Dataset semantics I proposed the "Simple Dataset semantics"
> (name chosen to mirror Simple entailment in the RDF spec).
> In Simple entailment, predicates are not required to be instances of
> rdf:Property. But there is a semantic constraint provided by the RDF
> semantics which requires them to be.
> Similarly, there can be a semantic constraint in "RDF Dataset semantics"
> (an extension of "Simple Dataset semantics") which says that graph
> "names" must be interpreted as RDF graphs.
> This can be formalised in different ways depending on what we want to
> do. For instance, we can impose that the graph IRI denote exactly the
> graph between the curly brackets. Or that it denote a superset of the
> graph. Or that the graph IRI denotes the graph only in the default
> graph, but inside a named graph, it is not required to denote anything
> in particular. But whatever the choice taken there, these can be simply
> described as semantic extensions of the Simple Dataset semantics.
>
>
> >> The way things are going on in this WG tends to suggest that there
> >> will not be any formal semantics for RDF Datasets as there is too
> >> much disagreement on what it should be. I have the impression that
> >> it is the only viable, but disappointing alternative.
> >
> > I don't think we should give up yet. So far, in my experience, this WG
> > is no more internally fractious than other WGs I have been on. It
> > took the first RDF WG nine months to decide how to write the number
> > three, and the ISO group which made common logic went on for four
> > years without agreeing whether the logic was typed or untyped.
>
> I'm rather confident that these discussions can lead eventually to
> consensus, but I am a bit afraid of how much time this will take. There
> is a strong risk that it will take more time than what was initially
> allocated to the WG. I don't know what W3C policy is wrt extending the
> duration of WGs.
>
> >>>>
> >>>> In my opinion, if one just wants to quote a graph and talk about
> >>>> it, one just needs RDF triples.
> >>>
> >>> No, that won't do. At the very least we need reification or some
> >>> kind of graph literal construction.
> >>
> >> Not necessarily. RDF does not define a formal semantics for
> >> information about persons, yet it is perfectly possible to talk
> >> about people with RDF.
> >
> > Sigh. You keep saying this and it keeps missing the point. In the
> > case of graph naming, unlike that of person naming, there are
> > entailments that depend upon the name-graph naming relationship being
> > rigid. For example, you really do want the metadata to apply to the
> > actual graph (or graph container, whatever we decide) being named by
> > the name. I don't think that a 'social consensus' is good enough
> > here. But more to the point, with your dataset convention, there are
> > clear use cases where the graph "name" most assuredly does not denote
> > the graph (since it is being used to denote something else entirely),
> > so no amount of social consensus is going to make that work and still
> > be in conformity to the 2004 RDF specs. (Part of the idea behind the
> > 'contexts' design is to keep the association of IRIs to contexts (or
> > extensions) separate from what they denote, precisely in order to
> > allow this kind of usage.)
>
> Clearly, if you want to do complex reasoning over graphs and check
> consistency of metadata etc, you'll need some way to make clear how
> names are related and so on. But it seems to me that the cost it adds,
> in terms of expressiveness and constraints, is not worth the benefits
> and commonly accepted best practices are able to solve a huge part of
> the use cases.
> RDF has the advantage of being very much unconstrained so that it fits
> many scenarios easily. But the unconstrainedness is a problem in many
> cases too, that is why we have all these extensions like RDFS, OWL,
> SWRL, etc. that add their own constraints to solve complex use cases.
> I think we can do the same for datasets. Have a very unconstrained base
> and propose a few extensions that match the most common use cases.
> In addition to this, we could provide a mechanism to "announce" which
> extensions are used (probably what you have in mind with your
> "extension" proposal).
>
> >
> >> It just requires a social consensus such as FOAF. The same can
> >> happen for talking about graphs. Of course, if you need to do some
> >> stricter reasoning, you would need something more, like e.g. graph
> >> literals but I haven't yet found a convincing use case that would
> >> require it.
> >>
> >>>>
> >>>> <g> a :Graph ; dc:creator <me> ; :saysInTurtle ":s :p :o" .
> >>>
> >>> Is ":s :p :o" a string?
> >>
> >> Yes.
> >>
> >>>
> >>>>
> >>>> You can even have a "partial semantics" by separating the
> >>>> triples:
> >>>>
> >>>> <g> :saysInTurtle ":s :p :o", ":a :b :c" .
> >>>>
> >>>> Then it's just a matter of social consensus that :saysInTurtle
> >>>> is used to relate an RDF graph to a Turtle serialisation of
> >>>> that graph. You could also add something to the formal
> >>>> semantics, but on the one hand it would create headaches to all
> >>>> implementers (imposing something to be interpreted as an RDF
> >>>> Graph is much more troublesome than implementing
> >>>> rdf:XMLLiteral, for instance), and on the other hand, I can't
> >>>> think of any concrete real life situation where it's actually
> >>>> useful.
> >>>
> >>> I can. If someone wants to get ambitious with their library and
> >>> use some OWL reasoning (as for example the BBC are doing, for
> >>> one) then you really do want to have some connection with the OWL
> >>> content at the level of model theory, if only to clarify what
> >>> owl:sameAs is supposed to mean.
> >>
> >> This is not a concrete example. Can you show a real life problem
> >> that *requires* that a URI is interpreted as an RDF graph to be
> >> solved conveniently?
> >
> > How about using owl:sameAs on IRIs intended to denote graphs? Or
> > between an IRI and a blank node both intended to denote a graph, as
> > in some of Sandro's examples. Or suppose you have classes of graphs,
> > and want to define an OWL restriction class, for example the class of
> > all graphs containing program information whose associated date of
> > creation is earlier than 01012010. If graph "names" don't really
> > refer, none of this really makes sense.
>
> But what's the real life problem you're trying to solve here? What are
> the data and what useful conclusions you would draw from the fact that
> the name denotes the graph, which you would not be able to draw
> otherwise? I'll try to extend your example to see if I can get something.
>
> Consider the example:
>
> <joe> <says> <g> .
> <g> owl:sameAs <h> .
> <g> {
>   <joe> a foaf:Person .
> }
> <h> {
>   foaf:Person rdfs:subClassOf foaf:Agent .
> }
>
> what can we conclude? It all depends on how we interpret the named graphs.
>
> *Case 1.*
> If <g> is interpreted exactly as the graph inside the curly brackets,
> then we have an inconsistency. Can this be considered a useful
> conclusion in such a scenario? I don't know but I find that enforcing
> the graph IRI to denote exactly the graph is much too strong and would
> not be convenient for many use cases (e.g., facts evolving with time).
>
> *Case 2.*
> If <g> is interpreted as a supergraph of what's in the brackets, then
> we can conclude:
>
> <joe> <says> <g> .
> <g> owl:sameAs <h> .
> <g> {
>   <joe> a foaf:Person .
>   foaf:Person rdfs:subClassOf foaf:Agent .
> }
> <h> {
>   <joe> a foaf:Person .
>   foaf:Person rdfs:subClassOf foaf:Agent .
> }
>
> This already looks much more helpful. This probably fits Sandro's
> endorsement use case as it looks to me it's his suggested semantics.
>
> But still I find it unsatisfying when it comes to dealing with graphs
> having different provenance, from which you would like to conclude
> things such as:
>
> *Case 3.*
> In this case, the datasets should be read "from source <g>, I know that
> Joe is a person, from source <h>, I know that persons are agents, but I
> also know that source <g> and <h> are actually one source. So I can
> conclude that, according to source <g> (or <h>), Joe is an agent."
>
> <joe> <says> <g> .
> <g> owl:sameAs <h> .
> <g> {
>   <joe> a foaf:Person .
>   foaf:Person rdfs:subClassOf foaf:Agent .
>   <joe> a foaf:Agent .
> }
> <h> {
>   <joe> a foaf:Person .
>   foaf:Person rdfs:subClassOf foaf:Agent .
>   <joe> a foaf:Agent .
> }
>
> So in the end, case 3 leads to my proposal.
> Hmmm, looking at this and remembering what Ivan said a couple of times
> "we have to acknowledge that there is no fit-for-all semantics", maybe
> we can have two competing semantics, but there should be a way to
> declare which one is assumed when exchanging a TriG file.
>
>
> [skip]
>
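
The two readings of a dataset discussed in this thread can be made
concrete with a small sketch. The Python below is illustrative only and
is not taken from any message or specification. It assumes ground
triples (no blank nodes), implements just two RDFS rules (subClassOf
transitivity and rdf:type propagation), and models a dataset as a plain
dict from graph name to a set of triples. The names entails_per_graph,
entails_contains and merge_same_as are hypothetical. The "contains"
check is one possible rendering of the proposed reading in which only
literal containment of triples carries over, and merge_same_as is a
rough stand-in for Cases 2 and 3.

```python
from typing import Dict, FrozenSet, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]
Dataset = Dict[str, Graph]  # graph name -> triples; default graph left out

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def rdfs_closure(graph: Graph) -> Graph:
    """Close a graph under two RDFS rules only (an assumption of this sketch):
    subClassOf transitivity and propagation of rdf:type along subClassOf."""
    closed = set(graph)
    while True:
        subclass_pairs = [(s, o) for (s, p, o) in closed if p == SUBCLASS]
        derived = {
            (s, p, sup)
            for (s, p, o) in closed
            if p in (TYPE, SUBCLASS)
            for (sub, sup) in subclass_pairs
            if sub == o
        }
        if derived <= closed:
            return frozenset(closed)
        closed |= derived


def graph_entails(g1: Graph, g2: Graph) -> bool:
    """Plain graph entailment, restricted to the two rules above."""
    return g2 <= rdfs_closure(g1)


def entails_per_graph(d1: Dataset, d2: Dataset) -> bool:
    """Per-graph reading: apply entailment to each named graph separately."""
    return all(n in d1 and graph_entails(d1[n], g) for n, g in d2.items())


def entails_contains(d1: Dataset, d2: Dataset) -> bool:
    """'Contains' reading: the name is something containing these triples,
    so only triples literally present carry over."""
    return all(n in d1 and g <= d1[n] for n, g in d2.items())


def merge_same_as(ds: Dataset, a: str, b: str, close: bool) -> Dataset:
    """Treat names a and b as one source: both get the union of the triples
    (Case 2), optionally closed under the rules above (Case 3)."""
    merged = ds[a] | ds[b]
    if close:
        merged = rdfs_closure(merged)
    return {**ds, a: merged, b: merged}


premise = {":d": frozenset({(":c", SUBCLASS, ":d"), (":x", TYPE, ":c")})}
conclusion = {":d": frozenset({(":x", TYPE, ":d")})}
print(entails_per_graph(premise, conclusion))  # True
print(entails_contains(premise, conclusion))   # False

joe = {
    "<g>": frozenset({("<joe>", TYPE, "foaf:Person")}),
    "<h>": frozenset({("foaf:Person", SUBCLASS, "foaf:Agent")}),
}
case2 = merge_same_as(joe, "<g>", "<h>", close=False)
case3 = merge_same_as(joe, "<g>", "<h>", close=True)
print(("<joe>", TYPE, "foaf:Agent") in case2["<g>"])  # False
print(("<joe>", TYPE, "foaf:Agent") in case3["<g>"])  # True
```

In this sketch the graph-level entailment holds under both readings;
the readings differ only in whether it is lifted to the dataset level.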
Received on Wednesday, 2 May 2012 13:56:12 UTC