Re: [Concepts] Editorial changes to Blank Nodes (ISSUE-107) from Pierre-Antoine Champin on 2012-11-21 (public-rdf-wg@w3.org from November 2012)

From: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>
Date: Wed, 21 Nov 2012 19:10:54 +0100
To: RDF Working Group WG <public-rdf-wg@w3.org>
Message-ID: <CA+OuRR-d2HL1SNjK_djiXiE3Gir5BkFcmJtz4Zk57RLBNm2Hpg@mail.gmail.com>
Following today's discussion at the telecon,
and attempting to address Antoine's objection that mathematical structures
are not "copied",
I propose the following rephrasing:

The *copy* of an RDF graph G into a scope s is another RDF graph G'
containing the same triples as G, but where every blank node has been
replaced by a fresh blank node in the scope s. Note that occurrences... (no
change after that).

That's a more declarative way of looking at the copy "operation".
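
As a rough illustration of that definition (a sketch only, not proposed spec
text), the copy can be written as a pure function from G to G'. The names
BlankNode and copy_graph, the use of strings for scopes and IRIs, and the
serial counter that makes each node distinct are all assumptions made for
this example.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical source of distinctness: every BlankNode gets a unique serial.
_serials = count()


@dataclass(frozen=True)
class BlankNode:
    """A blank node tagged with the scope it belongs to (illustrative model)."""
    scope: str
    serial: int = field(default_factory=lambda: next(_serials))


def copy_graph(graph, scope):
    """Return G': the same triples as `graph`, with every blank node
    replaced by a fresh blank node in `scope`.

    All occurrences of one blank node in G map to the same fresh node in G',
    so the shape of the graph is preserved. `graph` itself is not modified.
    """
    replacement = {}

    def substitute(term):
        if isinstance(term, BlankNode):
            if term not in replacement:
                replacement[term] = BlankNode(scope)
            return replacement[term]
        return term  # IRIs and literals are kept as they are

    # Predicates are always IRIs, so only subject and object are substituted.
    return frozenset((substitute(s), p, substitute(o)) for s, p, o in graph)
```

Because G' is described purely in terms of G and s, nothing here depends on
how an implementation actually allocates or stores blank nodes.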

  pa


On Tue, Nov 13, 2012 at 5:28 AM, Pat Hayes <phayes@ihmc.us> wrote:

>
> On Nov 12, 2012, at 8:00 AM, Richard Cyganiak wrote:
>
> > On 12 Nov 2012, at 09:09, Antoine Zimmermann wrote:
> >>> The *blank nodes* in an RDF graph are drawn from some arbitrary infinite
> >>> set that fulfils the following conditions:
> >>>
> >>> • It is disjoint from the set of IRIs and the set of all literals.
> >>> • Equality within the set is well-defined (*blank node equality*).
> >>
> >> What does the second item mean? Isn't equality well defined, in any set?
>
> Yes, it is. If identity isn't clear, then the very idea of a set is not
> clear.
>
> > The problem is that infinite sets cannot actually be implemented, and
> > therefore implementations need to approximate the definition. The sentence
> > draws attention to the requirement that in such approximate
> > implementations, it must still be possible to test blank nodes for equality.
>
> This makes no sense at all to me. (Are you saying that identity of items
> in infinite sets must be in some sense approximate?? But I can take
> something from a finite set and include it in another, infinite, set and
> it's still the same thing, so its identity does not change.)
>
> I don't see that implementation has anything to do with things at this
> point, as we are describing the abstract mathematical model.
>
> >
> >> It is the same as saying "Given two blank nodes, it is possible to
> >> determine whether or not they are the same."
> >
> > Yes. It's a restatement of that phrase.
>
> Well, I guess it is, in that that phrase also doesn't make sense, though
> for different reasons.
>
> >
> >> The latter says that in an implementation, either the set of bnodes is
> >> explicitly known, or the implementation knows an isomorphism from a
> >> well-known set to the set of bnodes. E.g., assign a bnode id to all bnodes,
> >> then one decides if two occurrences of bnodes involve the same bnode by
> >> simply comparing the identifiers.
> >
> > Sure, this is yet another restatement of the same phrase. Are you
> > proposing a particular edit?
> >
> >>> Allocating a *fresh blank node* is the action of drawing a new node
> >>> from the set.
> >>> ]]
> >>>
> >>> http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#section-blank-nodes
> >>
> >> It is not quite clear in what way it is "new". It has to be new wrt a
> >> given RDF graph (that is, a bnode that is not already used in a given RDF
> >> graph).
> >
> > That's not correct. It has to be globally new. Remember, blank nodes can
> > be shared between graphs.
>
> But this does not make sense as stated. To make things more concrete,
> consider the set N of natural numbers, and consider the statement:
> "Allocating a fresh number is the action of drawing a new natural number
> from the set N."  What could this possibly mean? How does one "draw" a
> number "from" a set of numbers? (Does that just mean "Choose a number"?)
> What would "new" mean in this context? What kind of "global" would be
> meaningful?  Blank nodes are just like numbers in this way.
>
> It's a mistake to think of this in 'process' terms; that gets things
> confused. The whole idea of blank nodes was to be a simple underlying
> *mathematical* model which would underlie any particular processing or
> implementation strategy.
>
> >
> >>> [[
> >>> Since RDF systems generally refer to blank nodes only via such local
> >>> identifiers, it is necessary to “standardize apart” the blank node
> >>> identifiers when incorporating data that originates from an external
> >>> source. This may be done by systematically replacing the blank node
> >>> identifiers in incoming data with freshly allocated blank node identifiers.
> >>> ]]
> >>
> >> In fact, if the bnode IDs had global scope, this would still be
> >> necessary. The "standardisation apart" is part of the merge operation and
> >> is independent of the way bnodes are identified. The "standardisation
> >> apart" has to be made at the abstract syntax level, that is, the bnodes
> >> themselves, not the IDs, have to be changed.
> >
> > This isn't about the merge operation; this is about the case of, say,
> > loading a graph into a new slot in a graph store. If blank node identifiers
> > in the incoming data are systematically replaced, then, I believe, this
> > operation is safe; otherwise it is not.
>
> The *operation* is safe, yes. But what we are debating is how to express
> the underlying mathematical model so as to make this work. And the
> assumption we need is that the blank nodes themselves are not shared
> between the newly loaded graph and the graphs already present in the store.
>
> >
> > Graph merge is a separate issue, and we don't talk about it in this
> > section.
> >
> >> [As a side note, I think things would have been simpler, IMHO, if all
> >> bnodes had a globally unique identifier. It would also have made the
> >> discussions on the scope of bnodes easier, since we would have avoided
> >> discussing the scope of *identifiers*, and confusing the two types of
> >> scopes.]
> >
> > If they have globally unique identifiers, then surely they should be
> > IRIs, no? And then how is that any different from not having blank nodes at
> > all?
>
> I tend to agree with you here, and against Antoine. But let's not even try
> to go there :-)
>
> >
> > It certainly is a mess.
>
> It is actually quite simple, but it is not easy to explain, which I guess
> does make it into a mess. Mea culpa. I thought that having this idea of a
> 'global' set of blank nodes would be a way to avoid the other well-known
> mess of dealing with scoping rules for bound variables, which would have
> been the obvious way to handle (what are now called) blank nodes in RDF.
> Scoping and the bound/free distinction were notoriously hard to describe in
> relational logic, but programmers are so used to this way of thinking that
> it might have worked better than the simpler bnode idea. Wisdom after the
> event, I guess.
>
> Pat
>
> >
> > Best,
> > Richard
> >
>
> ------------------------------------------------------------
> IHMC                                     (850)434 8903 or (650)494 3973
> 40 South Alcaniz St.           (850)202 4416   office
> Pensacola                            (850)202 4440   fax
> FL 32502                              (850)291 0667   mobile
> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Wednesday, 21 November 2012 18:11:27 UTC