Re: [Concepts] Editorial changes to Blank Nodes (ISSUE-107) from Pierre-Antoine Champin on 2012-11-21 (public-rdf-wg@w3.org from November 2012)

From: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>
Date: Wed, 21 Nov 2012 20:05:21 +0100
To: Pat Hayes <phayes@ihmc.us>
Cc: RDF Working Group WG <public-rdf-wg@w3.org>
Message-ID: <CA+OuRR_DE0xnCw6Q1Y6LQ4HA0e+kUsqig9uV1c0fZd28UCu2JQ@mail.gmail.com>
Pat,

my intention was not to take a binding decision about fine details of wording,
more to explore how Richard's proposal (or at least my -- possibly biased
-- understanding of it)
could be expressed in a more declarative way, as it is how the rest of the
abstract syntax is defined.
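
To make the declarative reading concrete, here is a minimal sketch in Python.
It is only an illustration under my own assumptions, not proposed spec text:
blank nodes are modelled as hypothetical (scope, identifier) pairs, IRIs and
literals as plain strings, and the scope s is assumed to be unused so far, so
that the new nodes really are fresh. The names BNode and copy_into_scope are
made up for this example.

```python
from typing import NamedTuple


class BNode(NamedTuple):
    """Hypothetical model: a blank node is a (scope, identifier) pair."""
    scope: str
    ident: str


def copy_into_scope(graph, scope):
    """Return a graph G' with the same shape as `graph`, where every blank
    node is consistently replaced by a blank node in `scope`.

    Assumes `scope` contains no blank nodes yet, so the replacements are fresh.
    """
    mapping = {}  # old blank node -> its replacement in the target scope

    def rename(term):
        if isinstance(term, BNode):
            if term not in mapping:
                mapping[term] = BNode(scope, f"b{len(mapping)}")
            return mapping[term]
        return term  # IRIs and literals are left untouched

    return frozenset(tuple(rename(t) for t in triple) for triple in graph)


# Example: two triples sharing the blank node ("s1", "x").
g = frozenset({
    (BNode("s1", "x"), "http://example.org/knows", BNode("s1", "y")),
    (BNode("s1", "x"), "http://example.org/name", '"Alice"'),
})
g_copy = copy_into_scope(g, "s2")
```

Here G' is described purely by a consistent renaming, so multiple occurrences
of one blank node in G map to occurrences of one blank node in G'.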

My intention was not to interfere with the job of the editors, and I
apologize if that sounded that way.

  pa

On Wed, Nov 21, 2012 at 7:47 PM, Pat Hayes <phayes@ihmc.us> wrote:

> No, please. The whole idea of Richard's proposal was to avoid having to
> speak of identity of blank nodes, and the idea of a 'fresh' blank node. The
> fact is, that whole idea was, in retrospect, a conceptual mistake. There is
> no way to determine identity of blank nodes other than by checking blank
> node identifiers and scopes of those identifiers, and no way to replace one
> blank node by another, other than by changing a blank node identifier and
> relying on some convention by which this is understood to also change the
> blank nodes. And once we have identifiers and scopes, we really do not need
> to talk of identity or 'freshness' of blank nodes at all.
>
> And in any case, this wording is just as technically wrong, since G' does
> not contain the *same* triples as G (if it did, it would be the same
> graph.) A more correct wording would be that G' is graph-equivalent to G,
> but that definition (of graph equivalence) itself should be revised, IMO,
> to avoid talk of replacing one blank node by another.
>
> More generally, I think this proposal is another example of editorial
> micro-managing, and that there has been too much of this in the WG. The
> business of the WG should not be to discuss fine details of wordings before
> the relevant documents are even fully written. The Concepts and Semantics
> editors have a tricky job, to weave the various conceptual and technical
> threads together into coherent prose, where each notion only uses notions
> that have been defined earlier in the sequence. This requires one to take a
> large-scale overall view of the various ideas and definitions, and
> sometimes to modify them slightly to make sure that the sequence makes
> sense. If the WG takes binding decisions about fine details of wording
> ahead of time, this can be an impossible cramp upon the ability of the
> editors to write a coherent document.
>
> Too many cooks, and so on.
>
> Pat
>
>
> On Nov 21, 2012, at 10:10 AM, Pierre-Antoine Champin wrote:
>
> > Following today's discussion at the telecon,
> > and attempting to address Antoine's objection that mathematical structures are not "copied",
> > I propose the current rephrasing:
> >
> > The copy of an RDF graph G into a scope s is another RDF graph G'
> containing the same triples as G, but where every blank node has been
> replaced by a fresh blank node in the scope s. Note that occurrences... (no
> change after that).
> >
> > That's a more declarative way of looking at the copy "operation".
> >
> >   pa
> >
> >
> > On Tue, Nov 13, 2012 at 5:28 AM, Pat Hayes <phayes@ihmc.us> wrote:
> >
> > On Nov 12, 2012, at 8:00 AM, Richard Cyganiak wrote:
> >
> > > On 12 Nov 2012, at 09:09, Antoine Zimmermann wrote:
> > >>> The *blank nodes* in an RDF graph are drawn from some arbitrary
> infinite
> > >>> set that fulfils the following conditions:
> > >>>
> > >>> • It is disjoint from the set of IRIs and the set of all literals.
> > >>> • Equality within the set is well-defined (*blank node equality*).
> > >>
> > >> What does the second item mean? Isn't equality well defined, in any
> set?
> >
> > Yes, it is. If identity isn't clear, then the very idea of a set is not clear.
> >
> > > The problem is that infinite sets cannot actually be implemented, and
> therefore implementations need to approximate the definition. The sentence
> draws attention to the requirement that in such approximate
> implementations, it must still be possible to test blank nodes for equality.
> >
> > This makes no sense at all to me. (Are you saying that identity of items in infinite sets must be in some sense approximate?? But I can take something from a finite set and include it in another, infinite, set and it's still the same thing, so its identity does not change.)
> >
> > I don't see that implementation has anything to do with things at this
> point, as we are describing the abstract mathematical model.
> >
> > >
> > >> It is the same as saying "Given two blank nodes, it is possible to
> determine whether or not they are the same."
> > >
> > > Yes. It's a restatement of that phrase.
> >
> > Well, I guess it is, in that that phrase also doesn't make sense, though
> for different reasons.
> >
> > >
> > >> The latter says that in an implementation, either the set of bnodes is explicitly known, or the implementation knows an isomorphism from a well-known set to the set of bnodes. E.g., assign a bnode id to all bnodes, then one decides if two occurrences of bnodes involve the same bnode by simply comparing the identifiers.
> > >
> > > Sure, this is yet another restatement of the same phrase. Are you
> proposing a particular edit?
> > >
> > >>> Allocating a *fresh blank node* is the action of drawing a new node
> from the set.
> > >>> ]]
> > >>>
> http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#section-blank-nodes
> > >>
> > >> It is not quite clear in what way it is "new". It has to be new wrt a
> given RDF graph (that is, a bnode that is not already used in a given RDF
> graph).
> > >
> > > That's not correct. It has to be globally new. Remember, blank nodes
> can be shared between graphs.
> >
> > But this does not make sense as stated. To make things more concrete,
> consider the set N of natural numbers, and consider the statement:
> "Allocating a fresh number is the action of drawing a new natural number
> from the set N."  What could this possibly mean? How does one "draw" a
> number "from" a set of numbers? (Does that just mean "Choose a number"?)
> What would "new" mean in this context? What kind of "global" would be
> meaningful?  Blank nodes are just like numbers in this way.
> >
> > It's a mistake to think of this in 'process' terms; that gets things confused. The whole idea of blank nodes was to be a simple underlying *mathematical* model which would underlie any particular processing or implementation strategy.
> >
> > >
> > >>> [[
> > >>> Since RDF systems generally refer to blank nodes only via such local
> identifiers, it is necessary to “standardize apart” the blank node
> identifiers when incorporating data that originates from an external
> source. This may be done by systematically replacing the blank node
> identifiers in incoming data with freshly allocated blank node identifiers.
> > >>> ]]
> > >>
> > >> In fact, if the bnode IDs had global scope, this would still be
> necessary. The "standardisation apart" is part of the merge operation and
> is independent of the way bnodes are identified. The "standardisation
> apart" has to be made at the abstract syntax level, that is, the bnodes
> themselves, not the IDs, have to be changed.
> > >
> > > This isn't about the merge operation; this is about the case of, say,
> loading a graph into a new slot in a graph store. If blank node identifiers
> in the incoming data are systematically replaced, then, I believe, this
> operation is safe; otherwise it is not.
> >
> > The *operation* is safe, yes. But what we are debating is how to express
> the underlying mathematical model so as to make this work. And the
> assumption we need is that the blank nodes themselves are not shared
> between the newly loaded graph and the graphs already present in the store.
> >
> > >
> > > Graph merge is a separate issue, and we don't talk about it in this
> section.
> > >
> > >> [As a side note, I think things would have been simpler, IMHO, if all
> bnodes had a globally unique identifier. It would also have made the
> discussions on the scope of bnodes easier, since we would have avoided
> discussing the scope of *identifiers*, and confusing the two types of
> scopes.]
> > >
> > > If they have globally unique identifiers, then surely they should be
> IRIs, no? And then how is that any different from not having blank nodes at
> all?
> >
> > I tend to agree with you here, and against Antoine. But let's not even try to go there :-)
> >
> > >
> > > It certainly is a mess.
> >
> > It is actually quite simple, but it is not easy to explain, which I
> guess does make it into a mess. Mea culpa. I thought that having this idea
> of a 'global' set of blank nodes would be a way to avoid the other
> well-known mess of dealing with scoping rules for bound variables, which
> would have been the obvious way to handle (what are now called) blank nodes
> in RDF. Scoping and the bound/free distinction were notoriously hard to
> describe in relational logic, but programmers are so used to this way of
> thinking that it might have worked better than the simpler bnode idea.
> Wisdom after the event, I guess.
> >
> > Pat
> >
> > >
> > > Best,
> > > Richard
> > >
> >
> > ------------------------------------------------------------
> > IHMC                                     (850)434 8903 or (650)494 3973
> > 40 South Alcaniz St.           (850)202 4416   office
> > Pensacola                            (850)202 4440   fax
> > FL 32502                              (850)291 0667   mobile
> > phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
> >
>
Received on Wednesday, 21 November 2012 19:05:49 UTC