- From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Date: Wed, 22 Aug 2012 14:58:50 +0200
- To: public-rdf-wg@w3.org
What I do not like in the arguments is the hypothetical "if". Yes, of
course, if we can extend a minimal semantics to any other form of
semantics by mere additional semantic conditions, then why not? But I
claim that you are not going to be able to do this from the
quote-semantics to the dataset semantics of [1].

Would it be OK if we could define the quote-semantics as a semantic
extension of the semantics of [1]? Anyway, there is no need for a
hypothetical "if": I just did it:

http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/Dataset-semantics

This semantic extension of [1] gives the same entailments as what's in
the RDF Graph Identification proposal. If you don't trust me, I'll
provide a formal proof. (Or someone can provide a counterexample.)

So, to summarise, the proposal in [1]:
 - is extensible with proper semantic conditions to all kinds of other
   semantics;
 - with a little semantic extension, can cover all the use cases of the
   quote-semantics;
 - covers in addition all the use cases related to reasoning with
   multiple graphs (temporal, multi-source, etc.);
 - is very much in line with the SPARQL model, based on entailment
   regimes at the graph level, just like SPARQL.

Then I'd like to know what's wrong with this proposal?

--AZ

On 22/08/2012 12:06, Ivan Herman wrote:
> Antoine,
>
> let me try to understand what you propose, because there are
> different ways to interpret your mail. Is it:
>
> 1. RDF 1.1 should be completely silent on any semantics w.r.t.
> datasets, or
>
> 2. RDF 1.1 should adopt [1] as the semantics w.r.t. datasets instead
> of the 'quoting' semantics as the kind of 'base-line' semantics
>
> As for #2: I do not have any fundamental issue with it, technically.
> However, the proposal was first announced in March '11
>
> http://lists.w3.org/Archives/Public/public-rdf-wg/2011Mar/0277.html
>
> followed by a discussion thread; then it continued in a further
> discussion in a thread started by
>
> http://lists.w3.org/Archives/Public/public-rdf-wg/2011Apr/0116.html
>
> finally, there was some revival in
>
> http://lists.w3.org/Archives/Public/public-rdf-wg/2011Aug/0105.html
>
> I am probably missing some other threads, but the fact remains that
> the WG could never get a consensus around [1]. _I am not interested
> to know why_, by the way; let us say it is part of a collective
> failure of the group.
>
> *If* the WG can get to a consensus around that semantics as a base
> line now, I am personally fine with it (I do understand the arguments
> against the quote semantics). The feeling among ourselves, when we
> put together the document, was that the quote semantics is pretty
> much the bare minimum that the WG may get a consensus on and, if we
> define some sort of an extension mechanism, others like the one in
> [1] can also be expressed.
>
> Of course, we can go the #1 line. I would prefer not, and find a
> minimum, but I will not lie down on the road if that is what we will
> end up with...
>
> Ivan
>
> [1]
> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal#Semantics
>
> On Aug 22, 2012, at 10:28 , Antoine Zimmermann wrote:
>
>> Sandro, all,
>>
>> Sorry again to write very very long emails. I've put a tremendous
>> amount of thinking into this email, so it's really hard to make it
>> short and summarise all of it. I'm very sorry to say that I'm
>> leaning very much towards *not* adopting a formal semantics along
>> the lines the RDF Graph Identification proposal suggests. I can try
>> a summary:
>>
>> - what conclusion can we draw from a <name,graph> pair? In the G.I.
>>   proposal, essentially none;
>> - we do not need quote-semantics if we want a faithful
>>   retranscription of an existing graph (e.g., the crawl use case);
>> - the quote-semantics, as proposed, does not match the notion of
>>   quoting in natural language;
>> - all of SPARQL is based on applying an entailment regime to all
>>   the graphs in a target dataset, be they named or default;
>> - SPARQL ASK on basic graph patterns and GRAPH graph patterns
>>   matches very precisely the semantics of datasets that I proposed.
>>
>> Please read on for detailed explanations on these items.
>>
>> First, let me summarise the things on which we seem to agree:
>>
>> 1. considering all the discussions on use cases, existing
>>    implementations, SPARQL specs, etc., we agree that imposing that
>>    the graph IRI denotes the graph itself is too strong;
>> 2. we want a minimal semantics, as little constrained as possible,
>>    such that alternative semantics can be defined (by this group or
>>    another) as extensions of it by adding more constraints;
>> 3. a dataset with no named graphs "behaves" as if it was a normal
>>    RDF graph (in mathematical terms, we can say that there is an
>>    injective morphism from RDF Graphs to RDF Datasets, which means
>>    we can identify an RDF Graph with a corresponding RDF Dataset
>>    with no named graphs).
>>
>> Let us imagine we only do that, proposing a minimal semantics that
>> fulfills the 3 items.
>> Formally, one possible proposal could be the following:
>>
>> A simple-dataset-interpretation (or an
>> rdf/rdfs/d/owl-dataset-interpretation) wrt vocabulary V is a
>> simple-interpretation (or an rdf/rdfs/d/owl-interpretation) wrt
>> vocabulary V \union {rdf:hasGraph} such that:
>>
>> - if a dataset D includes a default graph G, then I(G) = false
>>   implies I(D) = false;
>> - if a dataset D includes a named graph <n,G>, then G is in IR
>>   (i.e., in the set of resources of interpretation I), n is in
>>   vocabulary V, and <I(n),G> belongs to IEXT(I(rdf:hasGraph));
>> - in any other case, I(D) is false for a dataset D.
>>
>> The problem is, without further restrictions, this leads to a
>> semantics of "no-semantics" for named graphs. We are not allowed to
>> draw any conclusion from a <name,graph> pair. We end up
>> formalising, as a model theoretic semantics, the notion of "no
>> semantics".
>>
>> Let me explain this by reducing the case to the RDF semantics. We
>> all agree that RDF talks about resources, that literals are a
>> special case of resources, that URIs denote resources and there
>> exist relationships between resources. But we do not all agree to
>> make entailments on RDF data because there are times when we want
>> to faithfully transmit an RDF graph exactly as it was produced.
>>
>> So we formalise the "semantics of no-semantics" of RDF like this: a
>> no-interpretation is a tuple (IR,IP,LV,IS,IL,IEXT) such that:
>> - IR is a set of resources,
>> - IP is ..., etc... (see RDF Semantics)
>>
>> denotation of graphs:
>> - for an RDF graph G, I(G) is true iff G is in IR.
>>
>> This is a semantics where graphs do not entail anything, except
>> themselves. All the semantics in RDF Semantics 2004 can be derived
>> from this by adding more constraints. So we are happy as we have
>> the core semantics from which everything else derives.
>>
>> BUT this is absurd! You don't need to define a semantics of
>> no-semantics.
>> If you need to keep the original triples, you simply do not apply
>> the semantics, or at least not to the data you must share. If you
>> want to transmit a faithful representation of a graph, just do it!
>> It's legal. It's done all the time. It does not prevent anyone,
>> including the one who shares a faithful copy of an existing graph,
>> from drawing conclusions from the graph.
>>
>> That is what a crawler does: it meets normal RDF graphs in the wild
>> and faithfully transcribes them into named graphs, even though, as
>> they are RDF Graphs, they have a normative semantics. The semantics
>> does not have any effect on graphs. A formal semantics does
>> *nothing*. It does not put conclusions in people's mouths.
>>
>> A semantics tells you what you are *allowed* to conclude. It tells
>> you neither what to do with these conclusions nor what you are
>> *forced* to conclude. And frankly, I would really like to be
>> allowed to conclude, even without further information, that
>> <g> { <s> <p> [] } holds whenever <g> { <s> <p> <o> } holds. I
>> think, after all, that there's hardly one, if any at all, use case
>> which requires that it is not allowed to draw this conclusion.
>>
>> Take this other angle: assume we have a Web crawler or application
>> that fetches RDF documents online. It looks up
>> http://example.com/stuff.rdf and gets an RDF graph. Distinguish 2
>> possibilities:
>>
>> 1. It puts the RDF graph into a <name,graph> pair. It ends up with,
>> for instance:
>>
>> ex:stuff.rdf { <s> <p> <o> . }
>>
>> Given the quote-semantics, it is not allowed to draw the following
>> conclusion, unless some extra information comes:
>>
>> ex:stuff.rdf { <s> <p> <o> . <p> a rdf:Property . }
>>
>> 2. It applies operations on the RDF graph to build the RDF-closure
>> of the RDF graph, that is, it simply draws conclusions from the
>> graph.
>> It then injects the closure into a <name,graph> pair and ends up
>> with:
>>
>> ex:stuff.rdf { <s> <p> <o> . <p> a rdf:Property . }
>>
>> These are all legal, semantically valid operations. The final named
>> graph is obtained from the two elements "ex:stuff.rdf" and
>> "{<s> <p> <o>}" by drawing conclusions in RDF and keeping the IRI
>> to index it.
>>
>> So, the construction would be valid and would follow logically
>> from the given graph and its IRI, but the <name,graph> pair would
>> not carry the conclusion nonetheless. What kind of semantics is
>> that?
>>
>> Another point is that SPARQL relies on an entailment regime (simple
>> entailment only for SPARQL 1.0), which it uses on all of the graphs
>> interrogated in a dataset. There is no special treatment of graphs
>> inside <name,graph> pairs.
>>
>> So:
>>
>> ASK WHERE { GRAPH <g> { <s> <p> [] } }
>>
>> answers yes iff the dataset:
>>
>> <g> { <s> <p> [] }
>>
>> is entailed by the target dataset according to the semantics of [1]
>> (which is (c) in my previous email). However, this answer has no
>> relationship with the quoting semantics, except if, by chance, the
>> graph named <g> happens to be exactly the triple "<s> <p> []".
>>
>> [1] Semantics, in TF-Graphs/RDF-Datasets-Proposal.
>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal#Semantics
>>
>> On 20/08/2012 19:11, Sandro Hawke wrote:
>>> On 08/20/2012 10:02 AM, Antoine Zimmermann wrote:
>>>
>>> I believe it's possible to handle the use cases that want (a) and
>>> (c) by standardizing on (b) and then defining additional RDF
>>> vocabulary terms (either now or later).
>>
>> I don't know how you can go from (b) to (c) or from (b) to (a). I
>> have not yet seen a fully stabilised version of (b), but the ones
>> that have been sketched do not make it easy to do so. However,
>> there is a stable and complete version of (c) and I can tell you
>> here how you can go from (c) to (a).
>> It suffices to add the following semantic condition to the
>> proposal of [1]:
>>
>> - for all names n1, n2 in the vocabulary V, Con(n1) = Con(n2).
>>
>> [1] Semantics, in TF-Graphs/RDF-Datasets-Proposal.
>> http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal#Semantics
>>
>> And if one wants to quote graphs, maybe they should use double
>> quotes:
>>
>> <g> ex:hasGraph "<s> <p> <o>"^^ex:Graph .
>>
>> which is valid and consistent RDF. This has exactly the semantics
>> of "no-semantics" described above.
>>
>> BTW, the action of quoting in natural language does not reduce the
>> possible inferences, it increases them. Compare:
>>
>> - Joe said the war is over.
>> - Joe said "the war is over".
>>
>> In both cases, I can infer that Joe said that the war has come to
>> an end. But in the second case, I know in addition that Joe used
>> the word "over". So, if we really want to simulate quotes, then it
>> should be a more expressive semantics rather than a weaker one. So
>> maybe we can define (b) in terms of (c) rather than the opposite.
>>
>>> (As an aside: I don't think the priorities have any formal
>>> weight. The WG has never resolved to accept or reject or
>>> prioritize any uses as more important than any other.)
>>
>> Yep, no formal weight but the priorities are showing which use
>> cases are more important than others, in the view of people from
>> this working group. That's enough to take a serious look at the
>> highest priority.
>>
>>>> Also, the condition ∀i: I(ui) = Gi is problematic. At first, it
>>>> seems to be natural to say that the graph IRI RDF-denotes the
>>>> graph. But:
>>>>
>>>> http://www.w3.org/2011/rdf-wg/meeting/2011-04-14#resolution_1
>>>>
>>>> "RESOLVED: Named Graphs in SPARQL associate IRIs and graphs
>>>> *but* they do not necessarily "name" graphs in the strict
>>>> model-theoretic sense. A SPARQL Dataset does not establish
>>>> graphs as referents of IRIs (relevant to ISSUE-30)".
>>>>
>>>> I know this resolution is about SPARQL datasets, and it does not
>>>> necessarily apply to whatever structure we come up with in RDF,
>>>> but one of the Priority A use cases is to be able to dump a
>>>> SPARQL store. With this resolution, there is apparently a clash
>>>> between the use case requirement and the semantic condition.
>>>>
>>>
>>> I agree. I'm pretty sure ∀i: I(ui) = Gi is wrong. Most of the
>>> time, in practice, Ui denotes a g-box, not a g-snap. (Or,
>>> sometimes, it's something else associated with a g-box, like the
>>> primary subject.) I don't see how SPARQL 1.1 UPDATE with the
>>> GRAPH keyword makes any sense if Ui denotes Gi.
>>
>> The GRAPH keyword has its own semantics defined by SPARQL. It does
>> not relate to the RDF semantics. The GRAPH keyword is just an
>> indication that we want to work with the RDF graph inside a
>> certain <name,graph> pair. It is totally independent of what the
>> URI denotes in RDF semantics.
>>
>>>>
>>>> My proposal is to define several recommended semantics and
>>>> allow the concrete syntax to declare in a document what
>>>> semantics is assumed when exchanging a dataset.
>>>>
>>>> I find this idea appealing because it is in line with the fact
>>>> that information carried by HTTP is accompanied by a self
>>>> description of how it should be understood. For instance, we
>>>> have MIME types, we have <!DOCTYPE> declarations, etc. Since
>>>> RDF is not a purely syntactical data structure, it makes sense
>>>> that it carries with it a reference to the semantics it uses.
>>>> Such practices of referencing the MIME type, charset, doctype,
>>>> schema, etc. have been a key enabler of interoperability on the
>>>> Web. Why not extend the pattern to the formal semantics? BTW,
>>>> SPARQL services have a way to tell what inference regime they
>>>> support, and SPARQL queries have a way to ask for a particular
>>>> regime.
>>>> I claim that my proposal is simply in agreement with already
>>>> accepted notions in the SPARQL world.
>>>>
>>>
>>> I see the appeal -- solving each kind of problem with an
>>> approach crafted directly for it -- but my sense is this would
>>> cause too much confusion in the market and result in a lack of
>>> interoperability. I think we're better off standardizing (b) now,
>>> as long as I'm right that we can address the (a) and (c) use
>>> cases using just additional vocabulary.
>>
>> I'm pretty sure you cannot get from (b) to (c) with merely
>> additional vocabulary. Not in the way the semantics of (b) have
>> been tentatively defined so far. You'd really need extra stuff in
>> the structure of an interpretation.
>>
>>>
>>> -- Sandro
>>>
>>>>
>>>> Best,
>>>
>>
>> --
>> Antoine Zimmermann
>> ISCOD / LSTI - Institut Henri Fayol
>> École Nationale Supérieure des Mines de Saint-Étienne
>> 158 cours Fauriel
>> 42023 Saint-Étienne Cedex 2
>> France
>> Tél:+33(0)4 77 42 66 03
>> Fax:+33(0)4 77 42 66 66
>> http://zimmer.aprilfoolsreview.com/
>
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> FOAF: http://www.ivan-herman.net/foaf.rdf

--
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 66 03
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Wednesday, 22 August 2012 12:59:19 UTC
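
To make the ASK example from the thread concrete, here is a rough Python
sketch of the two readings being contrasted. It is only an illustration
under stated assumptions, not the formal semantics of [1] or of the RDF
Graph Identification proposal: triples are plain tuples of strings, a
dataset is a dict from graph name to a set of triples, blank nodes are
strings starting with "_:" (standing in for "[]"), and the function
names are made up for this note. Simple entailment is checked by brute
force over blank node assignments, and the quoting reading is
approximated as exact equality of the named graph.

```python
from itertools import product


def is_bnode(term):
    # Assumption of this sketch: blank nodes are written "_:label".
    return term.startswith("_:")


def simply_entails(g1, g2):
    """True if some assignment of g2's blank nodes to terms of g1
    turns g2 into a subgraph of g1 (brute-force simple entailment)."""
    bnodes = sorted({t for triple in g2 for t in triple if is_bnode(t)})
    terms = sorted({t for triple in g1 for t in triple})
    for values in product(terms, repeat=len(bnodes)):
        mapping = dict(zip(bnodes, values))
        instance = {tuple(mapping.get(t, t) for t in triple) for triple in g2}
        if instance <= g1:
            return True
    return False


def ask_graph(dataset, name, pattern):
    """ASK WHERE { GRAPH <name> { pattern } }, applying simple
    entailment inside the named graph."""
    return name in dataset and simply_entails(dataset[name], pattern)


def ask_graph_quoted(dataset, name, pattern):
    """Hypothetical 'quoting' reading: the named graph only matches
    a pattern that is exactly the graph itself."""
    return name in dataset and dataset[name] == pattern


# Target dataset:  <g> { <s> <p> <o> }
dataset = {"<g>": {("<s>", "<p>", "<o>")}}
# Pattern:         <g> { <s> <p> [] }
pattern = {("<s>", "<p>", "_:b")}

print(ask_graph(dataset, "<g>", pattern))         # True
print(ask_graph_quoted(dataset, "<g>", pattern))  # False
```

The first call answers yes because "_:b" can be mapped to <o>; the
second answers no because the graph named <g> is not literally the
triple "<s> <p> []", which is the mismatch pointed out in the thread.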