- From: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>
- Date: Mon, 19 Nov 2012 19:32:04 +0100
- To: Richard Cyganiak <richard@cyganiak.de>
- Cc: RDF Working Group WG <public-rdf-wg@w3.org>
- Message-ID: <CA+OuRR_mCTNQySWpSV2OjjdJaJgL7gn8NH=YNN5BjGGoxPS5mA@mail.gmail.com>
Richard, all,

I like the proposal in its current state, with a few minor comments:

* I would put "scope" in bold face when it is first used (in the definition of b-node) rather than in the following paragraph, especially because it gives the impression at first sight that scopes are defined by documents. The following sentence explains that there may be other kinds of scopes, but stil...
* I would rephrase the definition of "fresh" as follows: [[In a given scope, a **fresh blank node** is a blank node with a blank node identifier that is new and unique within that scope.]]
* In order to keep a nice "definition" style to the whole, I would
* At the end of the paragraph about "copying into a scope", I would explicitly state that parsing and serializing are two typical cases of copying a graph into a new scope.
* I would move the definition of **merge** inside the note, as it is not so much a definition as a logical consequence of the definition of "copying into a scope".

pa

On Mon, Nov 19, 2012 at 1:39 PM, Richard Cyganiak <richard@cyganiak.de> wrote:

> Hi Antoine,
>
> Summary:
>
> 1. Either you don't understand my proposal, or you're wilfully ignoring
> parts of it.
>
> 2. You need to actually read the parts of the Semantics document that I
> explicitly pointed out to you in order to show that you are wrong.
>
> 3. You need to explain how representing different existentially quantified
> variables by the same abstract syntax construct isn't a horrible kludge.
>
> Details inline.
>
> On 18 Nov 2012, at 06:20, Antoine Zimmermann wrote:
> > On 17/11/2012 17:01, Richard Cyganiak wrote:
> >> Hi Antoine,
> >>
> >> To be honest, I think your proposal makes the problem worse by
> >> deepening the disconnect between abstract syntax and semantics. See,
> >> the problem is this. Let's assume we have two Turtle files:
> >>
> >> _:x :name "Alice".
> >>
> >> And another one:
> >>
> >> _:x :name "Bob".
> >>
> >> They use the same token _:x.
> >> But we know that according to the
> >> semantics, they don't necessarily label the same thing; both files
> >> can be true even if there's nothing in the universe that has both of
> >> the names "Alice" and "Bob".
> >
> > The formal semantics never refers to bnode identifiers. What the token _:x labels is defined by the Turtle spec.
> >
> > Let us avoid the concrete RDF syntaxes for now and stick to maths.
> > Your two files serialise two graphs but it's not possible to know what bnodes are serialised.
>
> That is not true in my proposal.
>
> [[
> Every RDF document forms its own, self-contained scope for blank nodes.
> ]]
>
> Two documents -- two scopes -- two different blank nodes.
>
> > It is not known whether b1 = b2 or not. Yet, RDF Concepts says:
> >
> > "Given two blank nodes, it is possible to determine whether or not they are the same."
> >
> > In the current situation, it is not possible to do that.
>
> Yes it *is* possible, even in RDF 2004; the specs just don't specify how
> to do it. The missing bit of specification would have to say: “Within a
> scope (that is, within a file or system), blank nodes are the same if they
> have the same identifier. Between scopes (that is, between files or
> systems), blank nodes are different.”
>
> My proposal adds that missing bit of spec text.
>
> > Now, replacing a bnode in a graph by another bnode does not change anything to what's asserted, so let us replace b1 and b2 by b. Then consider:
> >
> > G1' = {(b,:name,"Alice")}
> > G2' = {(b,:name,"Bob")}
> >
> > There you have exactly the same bnode in both graphs. Yet, nothing has changed.
>
> “Nothing has changed” is not true. The meaning of the individual graphs
> hasn't changed, but the meaning of their union has changed. Before, it
> didn't matter where the quantifiers are located. Now you've moved the
> quantifiers to just outside each graph.
>
> > The problem is, in RDF 2004, it is not possible to convey this situation in any syntaxes.
>
> How exactly is this a problem?
>
> > Here comes your design, and mine, into the picture. In your design, the situation would be:
> >
> > G1 = {((x,scope1),:name,"Alice")}
> > G2 = {((x,scope2),:name,"Bob")}
> >
> > in the case the nodes are not the same, and:
> >
> > G1' = {((x,scope),:name,"Alice")}
> > G2' = {((x,scope),:name,"Bob")}
> >
> > in the case the nodes are the same. In my design, instead of being unable to determine whether the graphs serialised are {G1,G2} or {G1',G2'},
>
> [[
> Every RDF document forms its own, self-contained scope for blank nodes.
> ]]
>
> Therefore, the serialised graphs in my proposal are G1 and G2. Different
> documents, different scopes. The G1',G2' situation is logically impossible
> in my proposal. If you want global scope for your blank nodes, skolemize
> them.
>
> > or even other pairs, it would be known that the Turtle documents serialise the following graphs:
> >
> > H1 = {(bx,:name,"Alice")}
> > H2 = {(bx,:name,"Bob")}
> >
> > where bx is *the* bnode with label "x".
> >
> > I do not see how it can be worse to better know the situation.
>
> In your proposal, you don't know the situation any better -- you only
> think that because you misunderstand what my proposal says.
>
> Also, this *still* doesn't say explicitly whether the *quantifier* is just
> around each graph or global.
>
> Also, you keep ignoring the big flaw of your proposal -- that things which
> are different in the semantics shouldn't be treated as the same in the
> abstract syntax. The bx in your H1 and H2 are different existential
> variables in the semantics, therefore they should be different blank nodes
> in the abstract syntax.
>
> >> So the Turtle syntax uses the same token to indicate two possibly
> >> different things. How do you explain that?
> >
> > Indeed, that's bad. The token may or may not indicate different bnodes and no one can know (until the next design).
>
> Yeah, and that's what my proposal fixes.
> Your proposal “fixes” that
> problem by introducing a new one -- different variables being represented
> by the same abstract syntax construct.
>
> >> The 2004 account handwaves around the issue by saying that _:x is
> >> just a local label, and leaving the question open whether they label
> >> the same or different things in the abstract syntax. So these two
> >> files may or may not serialize the same blank node. Then the
> >> semantics explains that even if it's the same blank node, if it comes
> >> from different places then we need to do a merge, and that creates
> >> different blank nodes.
> >
> > We actually do not need to do a merge anymore than if I have two integers, say the populations of states, from different places, I do not need to add them. Also, merge does not "create" different bnodes, anymore than addition create different integers.
>
> I have no idea what you are trying to say here.
>
> > There is no handwaving in my design, and it only refers to the concepts and abstract syntax, not to "files"
>
> Your design necessarily inherits the handwaving that RDF 2004 Semantics is
> doing on the issue:
>
> [[
> This effectively treats all blank nodes as having the same meaning as
> existentially quantified variables in the RDF graph in which they occur,
> and which have the scope of the entire graph. In terms of the N-Triples
> syntax, this amounts to the convention that would place the quantifiers
> just outside, or at the outer edge of, the N-Triples document corresponding
> to the graph. This in turn means that there is a subtle but important
> distinction in meaning between the operation of forming the union of two
> graphs and that of forming the merge. The simple union of two graphs
> corresponds to the conjunction ( 'and' ) of all the triples in the graphs,
> maintaining the identity of any blank nodes which occur in both graphs.
> This is appropriate when the information in the graphs comes from a single
> source, or where one is derived from the other by means of some valid
> inference process, as for example when applying an inference rule to add a
> triple to a graph. Merging two graphs treats the blank nodes in each graph
> as being existentially quantified in that graph, so that no blank node from
> one graph is allowed to stray into the scope of the other graph's
> surrounding quantifier. This is appropriate when the graphs come from
> different sources and there is no justification for assuming that a blank
> node in one refers to the same entity as any blank node in the other.
> ]]
> http://www.w3.org/TR/rdf-mt/#unlabel
>
> How is that not handwaving? Introducing the notion of “documents” and
> “sources” in Semantics strikes me as the completely wrong place. This
> belongs in Concepts, and that's what my proposal achieves.
>
> > and a notion of "scope" that I still don't really see a formalisation of. To get the notion of scope into the abstract syntax, you need a strict formalisation of it, IMO.
>
> Why?
>
> >> You *cannot* produce a coherent account of the simple two-file
> >> situation shown above without talking about scopes or sources
> >> *somewhere*.
> >
> > RDF 2004 does not do that
>
> You've got to be kidding me.
>
> >> Currently, that talk is hidden away in Semantics where
> >> most people won't see it (last paragraph of [2]).
>
> You see that [2] there? You didn't click it, right? Because you've read
> the entire document years ago, right? And you don't remember it mentioning
> document scopes or sources anywhere, right? Therefore, you didn't need to
> click on that link, right?
>
> >>> (id,scope) is just a complicated way of defining a globally unique
> >>> identifier for a bnode.
> >>
> >> That is nonsense. It's an explicit way of saying that id is not
> >> globally unique.
> >
> > By your definition, a bnode is a pair (id,scope).
>
> It is not.
>
> http://www.w3.org/2011/rdf-wg/wiki/User:Rcygania2/B-Scopes#Specification_Changes
>
> > This pair belongs to the unique (so that it is "shared" by everyone, or "global") set of pairs {(i,s)|i is a UNICODE string, s a scope}.
>
> Well, those pairs may be unique, but that still doesn't make them
> identifiers, because identifiers are names, and scopes are not names.
>
> I use the word “scope” in its standard computer science usage: The scope
> of an identifier is the context in which it has its meaning.
>
> [[
> Every RDF document forms its own, self-contained scope for blank nodes.
> The handling of scopes outside of RDF documents (for example, in RDF
> stores) is implementation-dependent. Other specifications MAY impose
> additional scoping rules.
> ]]
>
> Best,
> Richard
>
> > AZ
> >
> >> Best, Richard
> >>
> >> [1] http://www.w3.org/2011/rdf-wg/wiki/User:Rcygania2/B-Scopes#Specification_Changes
> >> [2] http://www.w3.org/TR/rdf-mt/#unlabel
> >>
> >> On 17 Nov 2012, at 10:27, Antoine Zimmermann wrote:
> >>
> >>> I don't find this really useful, and even confusing. Like Andy, I
> >>> see this as an implementation approach.
> >>>
> >>> (id,scope) is just a complicated way of defining a globally unique
> >>> identifier for a bnode.
> >>>
> >>> What I would say instead is the following:
> >>>
> >>> """ Bnodes are drawn from an infinite set. Each bnode has a label
> >>> being a UNICODE string, different from all other bnode labels. So
> >>> when one draws a bnode, one can tell which bnode it is.
> >>> Serialisation syntaxes that rely on bnode identifiers are in fact
> >>> identifying the exact bnode they use. """
> >>>
> >>> And everything stays the same. No other changes are required.
> >>>
> >>> It especially clarifies what this sentence means:
> >>>
> >>> """ Given two blank nodes, it is possible to determine whether or
> >>> not they are the same. """
> >>>
> >>> In RDF 2004, this sentence was never really implemented anywhere.
> >>> If you got a bunch of triples, then another bunch of triples, you
> >>> could not say which bnode of the first bunch were the same or
> >>> different as the bnodes of the second bunch.
> >>>
> >>> There are cases when you want to split a graph into subgraphs, in
> >>> which case you must know what bnodes actually appear in each
> >>> subgraph. To get back the full graph from the subgraphs, it is
> >>> required that you use set union, not merge. This requires that the
> >>> bnodes are all identified in the same way across the subgraphs.
> >>>
> >>> Notice that a bnode label does not denote anything in terms of the
> >>> formal semantics, so it has nothing to do with an IRI, and nothing
> >>> to do with a skolem IRI. The label is only there to tell which
> >>> bnode is used. It's an existential variable name and it can be
> >>> replaced by any other variable name without changing the meaning of
> >>> a graph.
> >>>
> >>> Fresh bnode may be defined formally as follows:
> >>>
> >>> """ Given a set of RDF graphs Sg, the triples of which containing a
> >>> set Sb of bnodes, a fresh bnode with respect to Sg is a bnode b not
> >>> in Sb. """
> >>>
> >>> Of course, when we say "new", it has to be new wrt something
> >>> predefined, thus the notion of "fresh bnode with respect to a set
> >>> of RDF graphs".
> >>>
> >>> --AZ
> >>>
> >>> On 14/11/2012 12:02, Richard Cyganiak wrote:
> >>>> Following recent discussions, I've written up a proposal to
> >>>> change the design of blank nodes in RDF by explicitly introducing
> >>>> scoped blank node identifiers into the abstract syntax.
> >>>>
> >>>> http://www.w3.org/2011/rdf-wg/wiki/User:Rcygania2/B-Scopes
> >>>>
> >>>> Requirements:
> >>>>
> >>>> • Consistency with all resolutions the WG has made so far
> >>>> • No changes to other specs beyond Concepts and Semantics required
> >>>> • No changes to conforming implementations required
> >>>>
> >>>> All further details are in the wiki.
> >>>>
> >>>> Comments welcome.
> >>>>
> >>>> Best, Richard
> >>>
> >>> -- Antoine Zimmermann ISCOD / LSTI - Institut Henri Fayol École
> >>> Nationale Supérieure des Mines de Saint-Étienne 158 cours Fauriel
> >>> 42023 Saint-Étienne Cedex 2 France Tél:+33(0)4 77 42 66 03
> >>> Fax:+33(0)4 77 42 66 66 http://zimmer.aprilfoolsreview.com/
> >
> > --
> > Antoine Zimmermann
> > ISCOD / LSTI - Institut Henri Fayol
> > École Nationale Supérieure des Mines de Saint-Étienne
> > 158 cours Fauriel
> > 42023 Saint-Étienne Cedex 2
> > France
> > Tél:+33(0)4 77 42 66 03
> > Fax:+33(0)4 77 42 66 66
> > http://zimmer.aprilfoolsreview.com/
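
As a rough illustration of the union/merge distinction discussed in this thread, here is a minimal Python sketch. It assumes that parsing a document copies its graph into a new scope and that a merge copies both graphs into one new scope before taking the union. The names `Scope`, `BNode`, `parse`, `copy_into`, `merge` and `bnodes` are hypothetical; they come neither from the proposal nor from any RDF library, and modelling a blank node as a scope plus a counter is only a convenience of the sketch, not the proposal's definition.

```python
import itertools
from dataclasses import dataclass


class Scope:
    """Hypothetical scope: hands out blank nodes never issued before in it."""

    def __init__(self):
        self._ids = itertools.count()

    def fresh(self):
        # A fresh blank node: new and unique within this scope.
        return BNode(self, next(self._ids))


@dataclass(frozen=True)
class BNode:
    scope: Scope  # compared by object identity
    ident: int


def parse(triples):
    """Parse one 'document'; every document forms its own scope."""
    scope, labels = Scope(), {}

    def term(t):
        if t.startswith("_:"):
            if t not in labels:
                labels[t] = scope.fresh()
            return labels[t]
        return t

    return {(term(s), p, term(o)) for s, p, o in triples}


def copy_into(graph, scope):
    """Copy a graph into a scope, replacing its blank nodes by fresh ones."""
    renamed = {}

    def term(t):
        if isinstance(t, BNode):
            if t not in renamed:
                renamed[t] = scope.fresh()
            return renamed[t]
        return t

    return {(term(s), p, term(o)) for s, p, o in graph}


def merge(g1, g2):
    """Merge as a consequence of copying: copy both graphs, then union."""
    scope = Scope()
    return copy_into(g1, scope) | copy_into(g2, scope)


def bnodes(graph):
    """All blank nodes occurring as subject or object in a graph."""
    return {t for s, _, o in graph for t in (s, o) if isinstance(t, BNode)}


# Two documents that both use the token _:x yield two different blank nodes.
alice = parse([("_:x", ":name", "Alice")])
bob = parse([("_:x", ":name", "Bob")])

# One document using _:x twice yields a single blank node.
both = parse([("_:x", ":name", "Alice"), ("_:x", ":name", "Bob")])
left = {t for t in both if t[2] == "Alice"}
right = both - left
```

Under these assumptions, `alice | bob` contains two blank nodes, `left | right` gives back `both` with its single blank node, and `merge(left, right)` separates that node into two, which is the difference between set union and merge that the thread argues about.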
Received on Monday, 19 November 2012 18:32:34 UTC