- From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Date: Wed, 22 Aug 2012 15:10:02 +0200
- To: public-rdf-wg@w3.org, Pat Hayes <phayes@ihmc.us>
What's interesting here is that, by adding a constraint to the notion of interpretation of [1], there seem to be fewer entailments than without the constraint. I imagine it is because the constraint imposes a relation between something semantic and something that is in the syntax (it imposes that a URI be interpreted as a component of a dataset). It could also be because the dataset interpretations of [1] rely on multiple RDF interpretations.

This is weird and I'd be interested to hear Pat on the subject. This may be a reason why we do not want to have the RDF graphs (which are syntactic things) themselves in the universe of interpretation (the semantic things).

...hmm. I don't know if I like it that much anymore.

AZ

On 22/08/2012 14:58, Antoine Zimmermann wrote:
> What I do not like in the arguments is the hypothetical "if". Yes, of
> course, if we can extend a minimal semantics to any other form of
> semantics by mere additional semantic conditions, then yes, why not?
>
> But I claim that you are not going to be able to do this from the
> quote-semantics to the dataset semantics of [1].
>
> Would it be ok if we could define the quote-semantics as a semantic
> extension of the semantics of [1]?
>
> Anyway, there is no need for a hypothetical "if": I just did it:
>
> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Dataset-semantics
>
> This semantic extension of [1] gives the same entailments as what's in
> the RDF Graph Identification proposal. If you don't trust me, I'll
> provide a formal proof. (Or someone can provide a counterexample.)
>
>
> So, to summarise, the proposal in [1]:
> - is extensible with proper semantic conditions to all kinds of other
>   semantics;
> - with little semantic extension, can cover all the use cases of the
>   quote-semantics;
> - covers in addition all the use cases related to reasoning with multiple
>   graphs (temporal, multi-source, etc.);
> - is very much in line with the SPARQL model, based on entailment
>   regimes at the graph level, just like SPARQL.
>
>
> Then I'd like to know what's wrong with this proposal?
>
>
> --AZ
>
>
> On 22/08/2012 12:06, Ivan Herman wrote:
>> Antoine,
>>
>> let me try to understand what you propose, because there are
>> different ways to interpret your mail. Is it:
>>
>> 1. RDF 1.1 should be completely silent on any semantics w.r.t.
>> datasets, or
>>
>> 2. RDF 1.1 should adopt [1] as the semantics w.r.t. datasets instead
>> of the 'quoting' semantics as the kind of 'base-line' semantics
>>
>>
>> As for #2: I do not have any fundamental issue with it, technically.
>> However, the proposal was first announced in March '11
>>
>> http://lists.w3.org/Archives/Public/public-rdf-wg/2011Mar/0277.html
>>
>> followed by a discussion thread; then it continued in a further
>> discussion in a thread started by
>>
>> http://lists.w3.org/Archives/Public/public-rdf-wg/2011Apr/0116.html
>>
>> finally, there was some revival in
>>
>> http://lists.w3.org/Archives/Public/public-rdf-wg/2011Aug/0105.html
>>
>> I am probably missing some other threads, but the fact remains that
>> the WG could never get a consensus around [1]. _I am not interested
>> to know why_, by the way; let us say it is part of a collective
>> failure of the group.
>>
>> *If* the WG can get to a consensus around that semantics as a base
>> line now, I am personally fine with it (I do understand the arguments
>> against the quote semantics).
>> The feeling among ourselves, when we
>> put together the document, was that the quote semantics is pretty
>> much the bare minimum that the WG may get a consensus on and, if we
>> define some sort of an extension mechanism, others like the one in
>> [1] can also be expressed.
>>
>> Of course, we can go the #1 line. I would prefer not, and find a
>> minimum, but I will not lie down the road if that is what we will end
>> up with...
>>
>> Ivan
>>
>> [1]
>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal#Semantics
>>
>>
>> On Aug 22, 2012, at 10:28 , Antoine Zimmermann wrote:
>>
>>> Sandro, all,
>>>
>>>
>>> Sorry again to write very very long emails. I've put a tremendous
>>> amount of thinking into this email, so it's really hard to make it
>>> short and summarise all of it. I'm very sorry to say that I'm
>>> leaning very much towards *not* adopting a formal semantics along the
>>> lines the RDF Graph Identification proposal suggests. I can try a
>>> summary:
>>> - what conclusion can we draw from a <name,graph> pair? In
>>>   the G.I. proposal, essentially none;
>>> - we do not need quote-semantics if we want a faithful
>>>   retranscription of an existing graph (e.g., the crawl use case);
>>> - the quote-semantics, as proposed, does not match the notion of
>>>   quoting in natural language;
>>> - all of SPARQL is based on applying an entailment regime to all
>>>   the graphs in a target dataset, be they named or default;
>>> - SPARQL ASK on basic graph patterns and GRAPH graph patterns
>>>   matches very precisely the semantics of datasets that I proposed.
>>> Please read on for detailed explanations on these items.
>>>
>>>
>>> First, let me summarise the things on which we seem to agree:
>>>
>>> 1. considering all the discussions on use cases, existing
>>>    implementations, SPARQL specs, etc., we agree that imposing that
>>>    the graph IRI denotes the graph itself is too strong;
>>> 2.
>>>    we want a
>>>    minimal semantics, as little constrained as possible, such that
>>>    alternative semantics can be defined (by this group or another) as
>>>    extensions of it by adding more constraints.
>>> 3. a dataset with no named graphs "behaves" as if it was a normal
>>>    RDF graph (in mathematical terms, we can say that there is an
>>>    injective morphism from RDF Graphs to RDF Datasets, which means we
>>>    can assimilate an RDF Graph to a corresponding RDF Dataset with no
>>>    named graphs).
>>>
>>>
>>> Let us imagine we only do that, proposing a minimal semantics that
>>> fulfils the 3 items. Formally, one possible proposal could be the
>>> following:
>>>
>>> A simple-dataset-interpretation (or an
>>> rdf/rdfs/d/owl-dataset-interpretation) wrt vocabulary V is a
>>> simple-interpretation (or an rdf/rdfs/d/owl-interpretation) wrt
>>> vocabulary V \union {rdf:hasGraph} such that:
>>>
>>> - if a dataset D includes a default graph G, then I(G) = false
>>>   implies I(D) = false;
>>> - if a dataset D includes a named graph <n,G>, then G is in IR
>>>   (i.e., in the set of resources of interpretation I), n is in
>>>   vocabulary V, and <I(n),G> belongs to IEXT(I(rdf:hasGraph));
>>> - in any other case, I(D) is false for a dataset D.
>>>
>>>
>>> The problem is, without further restrictions, this leads to a
>>> semantics of "no-semantics" for named graphs. We are not allowed to
>>> draw any conclusion from a <name,graph> pair. We end up
>>> formalising, as a model-theoretic semantics, the notion of "no
>>> semantics".
>>>
>>> Let me explain this by reducing the case to the RDF semantics. We
>>> all agree that RDF talks about resources, that literals are a
>>> special case of resources, that URIs denote resources and that there
>>> exist relationships between resources. But we do not all agree
>>> to make entailments on RDF data because there are times when we
>>> want to faithfully transmit an RDF graph exactly as it was
>>> produced.
>>>
>>> So we formalise the "semantics of no-semantics" of RDF like this: a
>>> no-interpretation is a tuple (IR,IP,LV,IS,IL,IEXT) such that:
>>> - IR is a set of resources,
>>> - IP is ..., etc... (see RDF Semantics)
>>>
>>> denotation of graphs:
>>> - for an RDF graph G, I(G) is true iff G is in IR.
>>>
>>> This is a semantics where graphs do not entail anything, except
>>> themselves. All the semantics in RDF Semantics 2004 can be derived
>>> from this by adding more constraints. So we are happy as we have
>>> the core semantics from which everything else derives.
>>>
>>>
>>> BUT this is absurd! You don't need to define a semantics of
>>> no-semantics. If you need to keep the original triples, you simply
>>> do not apply the semantics, or at least not to the data you must
>>> share. If you want to transmit a faithful representation of a graph,
>>> just do it! It's legal. It's done all the time. It does not prevent
>>> anyone, including the one who shares a faithful copy of an existing
>>> graph, from drawing conclusions from the graph.
>>>
>>> That is what a crawler does: it meets normal RDF graphs in the wild
>>> and faithfully transcribes them into named graphs, even though, as
>>> they are RDF Graphs, they have a normative semantics. The semantics
>>> does not have any effect on graphs. A formal semantics does
>>> *nothing*. It does not put conclusions in people's mouths.
>>>
>>> A semantics tells you what you are *allowed* to conclude. It does
>>> not tell you either what to do with these conclusions, or what you
>>> are *forced* to conclude. And frankly, I would really like to be
>>> allowed to conclude, even without further information, that <g>
>>> {<s> <p> [] } holds whenever <g> {<s> <p> <o> } holds. I
>>> think, after all, that there's hardly one, if any at all, use case
>>> which requires that it is not allowed to draw this conclusion.
>>>
>>>
>>> Take this other angle: assume we have a Web crawler or application
>>> that fetches RDF documents online.
>>> It looks up
>>> http://example.com/stuff.rdf and gets an RDF graph. Distinguish 2
>>> possibilities:
>>>
>>> 1. It puts the RDF graph into a <name,graph> pair.
>>> It ends up with, for instance:
>>>
>>> ex:stuff.rdf {<s> <p> <o> .}
>>>
>>> Given the quote-semantics, it is not allowed to draw the following
>>> conclusion, unless some extra information comes:
>>>
>>> ex:stuff.rdf {<s> <p> <o> . <p> a rdf:Property .}
>>>
>>> 2. It applies operations on the RDF graph to build the RDF-closure
>>> of the RDF graph, that is, it simply draws conclusions from the
>>> graph. It then injects the closure into a <name,graph> pair and
>>> ends up with:
>>>
>>> ex:stuff.rdf {<s> <p> <o> . <p> a rdf:Property .}
>>>
>>> These are all legal, semantically valid operations. The final named
>>> graph is obtained from the two elements "ex:stuff.rdf" and "{<s>
>>> <p> <o>}" by drawing conclusions in RDF and keeping the IRI to
>>> index it.
>>>
>>> So, the construction would be valid and would follow directly and
>>> logically from the given graph and its IRI, but the <name,graph>
>>> pair would not carry the conclusion nonetheless. What kind of
>>> semantics is that?
>>>
>>>
>>>
>>> Another point is that SPARQL relies on an entailment regime (simple
>>> entailment only for SPARQL 1.0), which it uses on all of the graphs
>>> interrogated in a dataset. There is no special treatment of graphs
>>> inside <name,graph> pairs.
>>>
>>> So:
>>>
>>> ASK WHERE { GRAPH <g> {<s> <p> [] } }
>>>
>>> answers yes iff the dataset:
>>>
>>> <g> {<s> <p> [] }
>>>
>>> is entailed by the target dataset according to the semantics of [1]
>>> (which is (c) in my previous email). However, this answer has no
>>> relationship with the quoting semantics, except if, by chance, the
>>> graph named <g> happens to be exactly the triple "<s> <p> []".
>>>
>>>
>>> [1] Semantics, in TF-Graphs/RDF-Datasets-Proposal.
>>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal#Semantics
>>>
>>>
>>> On 20/08/2012 19:11, Sandro Hawke wrote:
>>>> On 08/20/2012 10:02 AM, Antoine Zimmermann wrote:
>>>>
>>>> I believe it's possible to handle the use cases that want (a) and
>>>> (c) by standardizing on (b) and then defining additional RDF
>>>> vocabulary terms (either now or later).
>>>
>>> I don't know how you can go from (b) to (c) or from (b) to (a). I
>>> have not yet seen a fully stabilised version of (b), but the ones
>>> that have been sketched do not make it easy to do so. However,
>>> there is a stable and complete version of (c) and I can tell you
>>> here how you can go from (c) to (a). It suffices to add the
>>> following semantic condition to the proposal of [1]:
>>>
>>> - for all names n1, n2 in the vocabulary V, Con(n1) = Con(n2).
>>>
>>> [1] Semantics, in TF-Graphs/RDF-Datasets-Proposal.
>>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal#Semantics
>>>
>>>
>>> And if one wants to quote graphs, maybe they should use double quotes:
>>>
>>> <g> ex:hasGraph "<s> <p> <o>"^^ex:Graph .
>>>
>>> which is valid and consistent RDF. This has exactly the semantics
>>> of "no-semantics" described above.
>>>
>>> BTW, the action of quoting in natural language does not reduce the
>>> possible inferences, it increases them. Compare:
>>>
>>> - Joe said the war is over.
>>> - Joe said "the war is over".
>>>
>>> In both cases, I can infer that Joe said that the war has come to
>>> an end. But in the second case, I know in addition that Joe used
>>> the word "over". So, if we really want to simulate quotes, then it
>>> should be a more expressive semantics rather than a weaker one. So
>>> maybe we can define (b) in terms of (c) rather than the
>>> opposite.
>>>
>>>
>>>> (As an aside: I don't think the priorities have any formal
>>>> weight.
>>>> The WG has never resolved to accept or reject or
>>>> prioritize any uses as more important than any other.)
>>>
>>> Yep, no formal weight but the priorities are showing which use
>>> cases are more important than others, in the view of people from
>>> this working group. That's enough to take a serious look at the
>>> highest priority.
>>>
>>>
>>>>> Also, the condition ∀i: I(ui) = Gi is problematic. At first, it
>>>>> seems to be natural to say that the graph IRI RDF-denotes the
>>>>> graph. But:
>>>>>
>>>>> http://www.w3.org/2011/rdf-wg/meeting/2011-04-14#resolution_1
>>>>>
>>>>> "RESOLVED: Named Graphs in SPARQL associate IRIs and graphs
>>>>> *but* they do not necessarily "name" graphs in the strict
>>>>> model-theoretic sense. A SPARQL Dataset does not establish
>>>>> graphs as referents of IRIs (relevant to ISSUE-30)".
>>>>>
>>>>> I know this resolution is about SPARQL datasets, and it's not
>>>>> necessarily applying to whatever structure we come up with in
>>>>> RDF, but one of the Priority A use cases is to be able to dump
>>>>> a SPARQL store. With this resolution, there is apparently a
>>>>> clash between the use case requirement and the semantic
>>>>> condition.
>>>>>
>>>>
>>>> I agree. I'm pretty sure ∀i: I(ui) = Gi is wrong. Most of the
>>>> time, in practice, Ui denotes a g-box, not a g-snap. (Or,
>>>> sometimes, it's something else associated with a g-box, like the
>>>> primary subject.) I don't see how SPARQL 1.1 UPDATE with the
>>>> GRAPH keyword makes any sense if Ui denotes Gi.
>>>
>>> The GRAPH keyword has its own semantics defined by SPARQL. It does
>>> not relate to the RDF semantics. The GRAPH keyword is just an
>>> indication that we want to work with the RDF graph inside a
>>> certain <name,graph> pair. It is totally independent of what the
>>> URI denotes in RDF semantics.
>>>
>>>
>>>>>
>>>>> My proposal is to define several recommended semantics and
>>>>> allow the concrete syntax to declare in a document what
>>>>> semantics is assumed when exchanging a dataset.
>>>>>
>>>>> I find this idea appealing because it is in line with the fact
>>>>> that information carried by HTTP is accompanied by a self
>>>>> description of how it should be understood. For instance, we
>>>>> have MIME types, we have <!DOCTYPE> declarations, etc. Since
>>>>> RDF is not a purely syntactical data structure, it makes sense
>>>>> that it carries with it a reference to the semantics it uses.
>>>>> Such practices of referencing the MIME type, charset, doctype,
>>>>> schema, etc. have been a key enabler of interoperability on the
>>>>> Web. Why not extend the pattern to the formal semantics? BTW,
>>>>> SPARQL services have a way to tell what inference regime they
>>>>> support, and SPARQL queries have a way to ask for a particular
>>>>> regime. I claim that my proposal is simply in agreement with
>>>>> already accepted notions in the SPARQL world.
>>>>>
>>>>
>>>> I see the appeal -- solving each kind of problem with an
>>>> approach crafted directly for it -- but my sense is this would
>>>> cause too much confusion in the market and result in a lack of
>>>> interoperability. I think we're better off standardizing (b) now,
>>>> as long as I'm right that we can address the (a) and (c) use
>>>> cases using just additional vocabulary.
>>>
>>> I'm pretty sure you cannot get from (b) to (c) with merely
>>> additional vocabulary. Not in the way the semantics of (b) has been
>>> tentatively defined so far. You'd really need extra stuff in the
>>> structure of an interpretation.
>>>
>>>
>>>>
>>>> -- Sandro
>>>>
>>>>>
>>>>> Best,
>>>>
>>>
>>> --
>>> Antoine Zimmermann
>>> ISCOD / LSTI - Institut Henri Fayol
>>> École Nationale Supérieure des Mines de Saint-Étienne
>>> 158 cours Fauriel
>>> 42023 Saint-Étienne Cedex 2
>>> France
>>> Tél:+33(0)4 77 42 66 03
>>> Fax:+33(0)4 77 42 66 66
>>> http://zimmer.aprilfoolsreview.com/
>>>
>>
>>
>> ----
>> Ivan Herman, W3C Semantic Web Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>
>
-- 
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 66 03
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
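
For readers who want to experiment with the two readings contrasted in this thread, here is a rough Python sketch. It is not the definition given in [1] or in the RDF Graph Identification proposal. Graphs are modelled as sets of string triples, blank nodes as strings starting with "_:", and the function names (simple_entails, add_property_typing, per_graph_entails, quote_entails) are hypothetical. The per-graph reading is approximated by checking simple entailment graph by graph, the quoting reading by exact equality of the named graph, and the "RDF closure" applies only the single rule that types every predicate as rdf:Property.

```python
from itertools import product

RDF_TYPE = "rdf:type"
RDF_PROPERTY = "rdf:Property"


def simple_entails(graph, pattern):
    """True if some mapping of the pattern's blank nodes to terms of
    `graph` turns every pattern triple into a triple of `graph`."""
    blanks = sorted({t for triple in pattern for t in triple if t.startswith("_:")})
    terms = sorted({t for triple in graph for t in triple})
    for values in product(terms, repeat=len(blanks)):
        mapping = dict(zip(blanks, values))
        if all(tuple(mapping.get(t, t) for t in triple) in graph for triple in pattern):
            return True
    return False


def add_property_typing(graph):
    """Partial RDF closure (assumption: one rule only): every term used
    as a predicate is typed as rdf:Property, including rdf:type itself."""
    predicates = {p for _, p, _ in graph}
    if predicates:
        predicates.add(RDF_TYPE)
    return frozenset(graph) | {(p, RDF_TYPE, RDF_PROPERTY) for p in predicates}


def per_graph_entails(target, query, closure=frozenset):
    """Per-graph reading: each named graph of `query` must be entailed by
    the graph carrying the same name in `target`, after `closure`."""
    return all(
        name in target and simple_entails(closure(target[name]), graph)
        for name, graph in query.items()
    )


def quote_entails(target, query):
    """Crude stand-in for the quoting reading: the named graph must be
    exactly the same set of triples."""
    return all(target.get(name) == graph for name, graph in query.items())


# <g> {<s> <p> <o>} against <g> {<s> <p> []} and against the closed graph.
target = {"<g>": frozenset({("<s>", "<p>", "<o>")})}
weaker = {"<g>": frozenset({("<s>", "<p>", "_:b")})}
closed = {"<g>": frozenset({("<s>", "<p>", "<o>"), ("<p>", RDF_TYPE, RDF_PROPERTY)})}

print(per_graph_entails(target, weaker))
print(quote_entails(target, weaker))
print(per_graph_entails(target, closed))
print(per_graph_entails(target, closed, closure=add_property_typing))
```

Under these assumptions the per-graph check accepts the weaker dataset while the quoting check rejects it, and the dataset containing the rdf:Property triple is only accepted once the closure function is passed in, which mirrors the crawler example above.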
Received on Wednesday, 22 August 2012 13:10:32 UTC