
Re: [Concepts] Editorial changes to Blank Nodes (ISSUE-107)

From: Pat Hayes <phayes@ihmc.us>
Date: Wed, 21 Nov 2012 11:28:56 -0800
Cc: RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <76A59B92-D5AE-4282-8E88-EA2201B07ED5@ihmc.us>
To: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>
Pierre, sorry. My rant about micromanaging was not intended to be directed at you, but rather a general complaint to the WG. Sorry, a bit rushed this morning. 

Pat

On Nov 21, 2012, at 11:05 AM, Pierre-Antoine Champin wrote:

> Pat,
> 
> my intention was not to take a binding decision about fine details of wording,
> more to explore how Richard's proposal (or at least my -- possibly biased -- understanding of it)
> could be expressed in a more declarative way, as it is how the rest of the abstract syntax is defined.
> 
> My intention was not to interfere with the job of the editors, and I apologize if it sounded that way.
> 
>   pa
> 
> On Wed, Nov 21, 2012 at 7:47 PM, Pat Hayes <phayes@ihmc.us> wrote:
> No, please. The whole idea of Richard's proposal was to avoid having to speak of identity of blank nodes, and the idea of a 'fresh' blank node. The fact is, that whole idea was, in retrospect, a conceptual mistake. There is no way to determine identity of blank nodes other than by checking blank node identifiers and scopes of those identifiers, and no way to replace one blank node by another, other than by changing a blank node identifier and relying on some convention by which this is understood to also change the blank nodes. And once we have identifiers and scopes, we really do not need to talk of identity or 'freshness' of blank nodes at all.
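
A minimal Python sketch of the point made above, that blank node identity comes down to comparing an identifier together with its scope. The class name `BlankNode` and the fields `scope` and `label` are illustrative assumptions, not terms from the RDF Concepts draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlankNode:
    """Hypothetical model: a blank node is known only by (scope, label)."""
    scope: str  # assumed name for the identifier's scope, e.g. a document or store slot
    label: str  # the blank node identifier, meaningful only inside that scope


def same_blank_node(a: BlankNode, b: BlankNode) -> bool:
    # Identity test: both the scope and the identifier must agree.
    return a.scope == b.scope and a.label == b.label
```

Under this model the same label in two different scopes denotes two different blank nodes, so no separate notion of a "fresh" node is needed.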
> 
> And in any case, this wording is just as technically wrong, since G' does not contain the *same* triples as G (if it did, it would be the same graph.) A more correct wording would be that G' is graph-equivalent to G, but that definition (of graph equivalence) itself should be revised, IMO, to avoid talk of replacing one blank node by another.
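
To make the distinction between "same triples" and "graph-equivalent" concrete, here is a rough Python sketch. It assumes triples are tuples of strings and that blank nodes are strings starting with `_:`; both conventions, and all function names, are illustrative. It uses the usual bijection-based formulation of equivalence and a brute-force search, so it is only practical for very small graphs.

```python
from itertools import permutations


def is_blank(term):
    # Assumed convention: blank nodes are strings with a "_:" prefix.
    return isinstance(term, str) and term.startswith("_:")


def blank_nodes(graph):
    # Sorted list of the distinct blank nodes occurring in the graph.
    return sorted({t for triple in graph for t in triple if is_blank(t)})


def graph_equivalent(g1, g2):
    """True if some bijection between blank nodes maps g1 exactly onto g2."""
    g1, g2 = set(g1), set(g2)
    b1, b2 = blank_nodes(g1), blank_nodes(g2)
    if len(g1) != len(g2) or len(b1) != len(b2):
        return False
    # Try every one-to-one pairing of g1's blank nodes with g2's.
    for perm in permutations(b2):
        mapping = dict(zip(b1, perm))
        mapped = {tuple(mapping.get(t, t) for t in triple) for triple in g1}
        if mapped == g2:
            return True
    return False
```

Two graphs that differ only in their blank node labels are unequal as sets of triples, yet `graph_equivalent` accepts them.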
> 
> More generally, I think this proposal is another example of editorial micro-managing, and that there has been too much of this in the WG. The business of the WG should not be to discuss fine details of wordings before the relevant documents are even fully written. The Concepts and Semantics editors have a tricky job, to weave the various conceptual and technical threads together into coherent prose, where each notion only uses notions that have been defined earlier in the sequence. This requires one to take a large-scale overall view of the various ideas and definitions, and sometimes to modify them slightly to make sure that the sequence makes sense. If the WG takes binding decisions about fine details of wording ahead of time, this can be an impossible cramp upon the ability of the editors to write a coherent document.
> 
> Too many cooks, and so on.
> 
> Pat
> 
> 
> On Nov 21, 2012, at 10:10 AM, Pierre-Antoine Champin wrote:
> 
> > Following today's discussion at the telecon,
> > and attempting to address Antoine's objection that mathematical structures are not "copied",
> > I propose the current rephrasing:
> >
> > The copy of an RDF graph G into a scope s is another RDF graph G' containing the same triples as G, but where every blank node has been replaced by a fresh blank node in the scope s. Note that occurrences... (no change after that).
> >
> > That's a more declarative way of looking at the copy "operation".
> >
> >   pa
> >
> >
> > On Tue, Nov 13, 2012 at 5:28 AM, Pat Hayes <phayes@ihmc.us> wrote:
> >
> > On Nov 12, 2012, at 8:00 AM, Richard Cyganiak wrote:
> >
> > > On 12 Nov 2012, at 09:09, Antoine Zimmermann wrote:
> > >>> The *blank nodes* in an RDF graph are drawn from some arbitrary infinite
> > >>> set that fulfils the following conditions:
> > >>>
> > >>> • It is disjoint from the set of IRIs and the set of all literals.
> > >>> • Equality within the set is well-defined (*blank node equality*).
> > >>
> > >> What does the second item mean? Isn't equality well defined, in any set?
> >
> > Yes, it is. If identity isn't clear, then the very idea of a set is not clear.
> >
> > > The problem is that infinite sets cannot actually be implemented, and therefore implementations need to approximate the definition. The sentence draws attention to the requirement that in such approximate implementations, it must still be possible to test blank nodes for equality.
> >
> > This makes no sense at all to me. (Are you saying that identity of items in infinite sets must be in some sense approximate?? But I can take something from a finite set and include it in another, infinite, set and it's still the same thing, so its identity does not change.)
> >
> > I don't see that implementation has anything to do with things at this point, as we are describing the abstract mathematical model.
> >
> > >
> > >> It is the same as saying "Given two blank nodes, it is possible to determine whether or not they are the same."
> > >
> > > Yes. It's a restatement of that phrase.
> >
> > Well, I guess it is, in that that phrase also doesn't make sense, though for different reasons.
> >
> > >
> > >> The latter says that in an implementation, either the set of bnodes is explicitly known, or the implementation knows an isomorphism from a well known set to the set of bnodes. E.g., assign a bnode id to all bnodes, then one decides if two occurrences of bnodes involve the same bnode by simply comparing the identifiers.
> > >
> > > Sure, this is yet another restatement of the same phrase. Are you proposing a particular edit?
> > >
> > >>> Allocating a *fresh blank node* is the action of drawing a new node from the set.
> > >>> ]]
> > >>> http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#section-blank-nodes
> > >>
> > >> It is not quite clear in what way it is "new". It has to be new wrt a given RDF graph (that is, a bnode that is not already used in a given RDF graph).
> > >
> > > That's not correct. It has to be globally new. Remember, blank nodes can be shared between graphs.
> >
> > But this does not make sense as stated. To make things more concrete, consider the set N of natural numbers, and consider the statement: "Allocating a fresh number is the action of drawing a new natural number from the set N."  What could this possibly mean? How does one "draw" a number "from" a set of numbers? (Does that just mean "Choose a number"?) What would "new" mean in this context? What kind of "global" would be meaningful?  Blank nodes are just like numbers in this way.
> >
> > It's a mistake to think of this in 'process' terms; that gets things confused. The whole idea of blank nodes was to be a simple underlying *mathematical* model which would underlie any particular processing or implementation strategy.
> >
> > >
> > >>> [[
> > >>> Since RDF systems generally refer to blank nodes only via such local identifiers, it is necessary to “standardize apart” the blank node identifiers when incorporating data that originates from an external source. This may be done by systematically replacing the blank node identifiers in incoming data with freshly allocated blank node identifiers.
> > >>> ]]
> > >>
> > >> In fact, if the bnode IDs had global scope, this would still be necessary. The "standardisation apart" is part of the merge operation and is independent of the way bnodes are identified. The "standardisation apart" has to be made at the abstract syntax level, that is, the bnodes themselves, not the IDs, have to be changed.
> > >
> > > This isn't about the merge operation; this is about the case of, say, loading a graph into a new slot in a graph store. If blank node identifiers in the incoming data are systematically replaced, then, I believe, this operation is safe; otherwise it is not.
> >
> > The *operation* is safe, yes. But what we are debating is how to express the underlying mathematical model so as to make this work. And the assumption we need is that the blank nodes themselves are not shared between the newly loaded graph and the graphs already present in the store.
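
The load operation discussed above can be sketched in Python as follows. This is only an illustration under stated assumptions: `GraphStore`, `load`, and the `_:g<n>` label scheme are hypothetical names, triples are tuples of strings, and blank node identifiers are strings starting with `_:`.

```python
from itertools import count


class GraphStore:
    """Hypothetical graph store that standardizes blank node identifiers apart on load."""

    def __init__(self):
        self.slots = {}        # slot name -> set of triples
        self._fresh = count()  # one counter shared by every load into this store

    def _fresh_label(self):
        # Every label handed out by this store is distinct from all earlier ones.
        return f"_:g{next(self._fresh)}"

    def load(self, slot, triples):
        renaming = {}  # incoming identifier -> store identifier, for this load only

        def rename(term):
            if isinstance(term, str) and term.startswith("_:"):
                if term not in renaming:
                    renaming[term] = self._fresh_label()
                return renaming[term]
            return term  # IRIs and literals pass through unchanged

        self.slots[slot] = {tuple(rename(t) for t in triple) for triple in triples}
        return self.slots[slot]
```

Because the renaming table is rebuilt for each load, repeated occurrences of one incoming identifier stay linked within a graph, while graphs in different slots never end up sharing an identifier.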
> >
> > >
> > > Graph merge is a separate issue, and we don't talk about it in this section.
> > >
> > >> [As a side note, I think things would have been simpler, IMHO, if all bnodes had a globally unique identifier. It would also have made the discussions on the scope of bnodes easier, since we would have avoided discussing the scope of *identifiers*, and confusing the two types of scopes.]
> > >
> > > If they have globally unique identifiers, then surely they should be IRIs, no? And then how is that any different from not having blank nodes at all?
> >
> > I tend to agree with you here, and against Antoine. But let's not even try to go there :-)
> >
> > >
> > > It certainly is a mess.
> >
> > It is actually quite simple, but it is not easy to explain, which I guess does make it into a mess. Mea culpa. I thought that having this idea of a 'global' set of blank nodes would be a way to avoid the other well-known mess of dealing with scoping rules for bound variables, which would have been the obvious way to handle (what are now called) blank nodes in RDF. Scoping and the bound/free distinction were notoriously hard to describe in relational logic, but programmers are so used to this way of thinking that it might have worked better than the simpler bnode idea. Wisdom after the event, I guess.
> >
> > Pat
> >
> > >
> > > Best,
> > > Richard
> > >
> >
> > ------------------------------------------------------------
> > IHMC                                     (850)434 8903 or (650)494 3973
> > 40 South Alcaniz St.           (850)202 4416   office
> > Pensacola                            (850)202 4440   fax
> > FL 32502                              (850)291 0667   mobile
> > phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
> 

Received on Wednesday, 21 November 2012 19:29:28 GMT
