Re: [Concepts] Editorial changes to Blank Nodes (ISSUE-107) from Antoine Zimmermann on 2012-11-13 (public-rdf-wg@w3.org from November 2012)

From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Date: Tue, 13 Nov 2012 01:58:57 +0100
To: Richard Cyganiak <richard@cyganiak.de>
CC: RDF Working Group WG <public-rdf-wg@w3.org>
Message-ID: <50A19B51.8090206@emse.fr>
More comments below.

On 12/11/2012 17:00, Richard Cyganiak wrote:
> On 12 Nov 2012, at 09:09, Antoine Zimmermann wrote:
>>> The *blank nodes* in an RDF graph are drawn from some arbitrary
>>> infinite set that fulfils the following conditions:
>>>
>>> • It is disjoint from the set of IRIs and the set of all
>>>   literals.
>>> • Equality within the set is well-defined (*blank node
>>>   equality*).
>>
>> What does the second item mean? Isn't equality well defined, in any
>> set?
>
> The problem is that infinite sets cannot actually be implemented, and
> therefore implementations need to approximate the definition.

What do you mean by "implementing infinite sets"? Character strings, 
integers, decimals, RDF triples, RDF graphs, etc. are all members of
infinite sets.
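
As a rough illustration of that point, here is a minimal sketch in which blank nodes are drawn from the unbounded integers. The names `IRI`, `BlankNode` and `fresh_blank_node` are hypothetical and come from no specification or library, and "fresh" here only means new with respect to this one allocator in one process.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    # The integer is only an internal handle (an assumption of this sketch).
    # Python integers are unbounded, so the supply is never exhausted.
    n: int


_counter = itertools.count()


def fresh_blank_node() -> BlankNode:
    """Return a blank node this allocator has not returned before."""
    return BlankNode(next(_counter))


a = fresh_blank_node()
b = fresh_blank_node()
assert a == a and a != b          # equality between blank nodes is decidable
assert BlankNode(0) != IRI("0")   # different classes never compare equal
```

Nothing here enumerates the infinite set; the program only ever holds the finitely many nodes it has drawn so far.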

> The
> sentence draws attention to the requirement that in such approximate
> implementations, it must still be possible to test blank nodes for
> equality.

The sentence says that "equality within the set is well-defined". It is 
not talking about implementation.

>> It is the same as saying "Given two blank nodes, it is possible to
>> determine whether or not they are the same."

As I said, I meant "it is not the same as [etc]". This is about 
implementation.

> Yes. It's a restatement of that phrase.
>
>> The latter says that in an implementation, either the set of bnodes
>> is explicitly known, or the implementation knows an isomorphism
>> from a well known set to the set of bnodes. E.g., assign a bnode id
>> to all bnodes, then one decides if two occurrences of bnodes
>> involve the same bnode by simply comparing the identifiers.
>
> Sure, this is yet another restatement of the same phrase. Are you
> proposing a particular edit?
>
>>> Allocating a *fresh blank node* is the action of drawing a new
>>> node from the set. ]]
>>> http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#section-blank-nodes
>>
>> It is not quite clear in what way it is "new". It has to be new wrt a
>> given RDF graph (that is, a bnode that is not already used in a given
>> RDF graph).
>
> That's not correct. It has to be globally new. Remember, blank nodes
> can be shared between graphs.
>
>>> [[ Since RDF systems generally refer to blank nodes only via such
>>> local identifiers, it is necessary to “standardize apart” the
>>> blank node identifiers when incorporating data that originates
>>> from an external source. This may be done by systematically
>>> replacing the blank node identifiers in incoming data with
>>> freshly allocated blank node identifiers. ]]

In many cases, bnodes are provided via syntactic sugar like:

  [] <p> <o> .

where no identifier occurs.
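
A small sketch of what a parser might do with such input, assuming a hypothetical helper `node_for` and a toy `BlankNode` class (neither is taken from a real RDF library): a labelled blank node is looked up in a table scoped to the document, whereas `[]` has no label to look up and simply gets a new node each time.

```python
import itertools

_serials = itertools.count()


class BlankNode:
    """Opaque node; equality is object identity (Python's default)."""

    def __init__(self):
        self.serial = next(_serials)  # for debugging output only

    def __repr__(self):
        return f"<bnode {self.serial}>"


def node_for(token, labels):
    """Map a subject token from one document to a term.

    `labels` is a dict scoped to the document being parsed.
    """
    if token == "[]":
        return BlankNode()            # anonymous: always a new node
    if token.startswith("_:"):
        if token not in labels:       # labelled: one node per label per document
            labels[token] = BlankNode()
        return labels[token]
    return token                      # anything else (e.g. an IRI) is left as-is


doc_labels = {}
x1 = node_for("_:x", doc_labels)
x2 = node_for("_:x", doc_labels)
a1 = node_for("[]", doc_labels)
a2 = node_for("[]", doc_labels)
assert x1 is x2        # same label, same document: same node
assert a1 is not a2    # two occurrences of []: two distinct nodes
```

In this sketch there is no identifier to replace for `[]`, yet the resulting nodes are still distinct from all others.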

>>
>> In fact, if the bnode IDs had global scope, this would still be
>> necessary. The "standardisation apart" is part of the merge
>> operation and is independent of the way bnodes are identified. The
>> "standardisation apart" has to be made at the abstract syntax
>> level, that is, the bnodes themselves, not the IDs, have to be
>> changed.
>
> This isn't about the merge operation; this is about the case of, say,
> loading a graph into a new slot in a graph store. If blank node
> identifiers in the incoming data are systematically replaced, then, I
> believe, this operation is safe; otherwise it is not.
>
> Graph merge is a separate issue, and we don't talk about it in this
> section.
>
>> [As a side note, I think things would have been simpler, IMHO, if
>> all bnodes had a globally unique identifier. It would also have
>> made the discussions on the scope of bnodes easier, since we would
>> have avoided discussing the scope of *identifiers*, and confusing
>> the two types of scopes.]
>
> If they have globally unique identifiers, then surely they should be
> IRIs, no? And then how is that any different from not having blank
> nodes at all?

To make it clearer, imagine you have a pouch with an unlimited supply of
balls, each of which has a unique number on it. You also have special
items that have IRIs on them. You use these balls and items to arrange
triples like:

  [b1]  rdf:type  foaf:Person .

Then someone else takes your ball and makes the following:

  [b1]  rdf:type  foaf:Organization .

The ball is really the same. Yet there is no problem distinguishing the
meaning of the first statement (which only says that there exists a
person) from that of the second statement (which only says that there
exists an organisation). These two constructions never claim that the
person and the organisation are one and the same entity. If I want both
constructions to be true at the same time, I have to pick two distinct
balls from the pouch (the "standardisation apart" operation) to make the
following construction:

  [bX]  rdf:type  foaf:Person .
  [bY]  rdf:type  foaf:Organization .

It does not matter which balls I take, as long as I separate the one 
from the first statement and the one from the second.
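
The pouch analogy can be sketched in a few lines, assuming graphs are plain Python sets of triples and using hypothetical names (`Ball`, `draw_ball`, `standardise_apart`) that appear in no specification. The sketch contrasts a naive union, which keeps the shared ball, with a union taken after each construction has had its balls replaced by freshly drawn ones.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Ball:
    number: int  # the unique number written on the ball


_numbers = itertools.count(1)


def draw_ball():
    """Take a ball from the pouch that has not been drawn before."""
    return Ball(next(_numbers))


def standardise_apart(graph):
    """Return a copy of `graph` in which every ball is replaced by a new one.

    Repeated occurrences of the same ball within `graph` map to the same
    new ball, so the internal structure of the graph is preserved.
    """
    renaming = {}

    def rename(term):
        if isinstance(term, Ball):
            if term not in renaming:
                renaming[term] = draw_ball()
            return renaming[term]
        return term

    return {tuple(rename(term) for term in triple) for triple in graph}


def subjects(graph):
    return {s for (s, p, o) in graph}


b1 = draw_ball()
first = {(b1, "rdf:type", "foaf:Person")}
second = {(b1, "rdf:type", "foaf:Organization")}

naive = first | second
combined = standardise_apart(first) | standardise_apart(second)

assert len(subjects(naive)) == 1      # one ball carries both types
assert len(subjects(combined)) == 2   # bX and bY are distinct
```

Which numbers end up on the two new balls is irrelevant; only their being distinct from each other matters.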


AZ.


> It certainly is a mess.
>
> Best, Richard
>


-- 
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 66 03
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Tuesday, 13 November 2012 00:57:58 UTC