Re: [Concepts] Editorial changes to Blank Nodes (ISSUE-107) from Richard Cyganiak on 2012-11-12 (public-rdf-wg@w3.org from November 2012)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Mon, 12 Nov 2012 16:00:13 +0000
To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Cc: RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <E3D3B654-E2C9-4635-A5D8-813E57EF4356@cyganiak.de>

On 12 Nov 2012, at 09:09, Antoine Zimmermann wrote:
>> The *blank nodes* in an RDF graph are drawn from some arbitrary infinite
>> set that fulfils the following conditions:
>> 
>> • It is disjoint from the set of IRIs and the set of all literals.
>> • Equality within the set is well-defined (*blank node equality*).
> 
> What does the second item mean? Isn't equality well defined, in any set?

The problem is that infinite sets cannot actually be implemented, and therefore implementations need to approximate the definition. The sentence draws attention to the requirement that in such approximate implementations, it must still be possible to test blank nodes for equality.

> It is the same as saying "Given two blank nodes, it is possible to determine whether or not they are the same."

Yes. It's a restatement of that phrase.

> The latter says that in an implementation, either the set of bnodes is explicitly known, or the implementation knows an isomorphism from a well-known set to the set of bnodes. E.g., assign a bnode id to all bnodes, then one decides if two occurrences of bnodes involve the same bnode by simply comparing the identifiers.

Sure, this is yet another restatement of the same phrase. Are you proposing a particular edit?
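
For illustration only, the identifier-based reading above could look roughly like the following Python sketch. The class name `BlankNode` and the process-wide integer counter are assumptions made for the example; nothing in the spec text prescribes them.

```python
import itertools


class BlankNode:
    """Hypothetical blank node, distinguished only by an internal id."""

    # Assumption: one counter per process stands in for the "infinite set".
    _ids = itertools.count()

    def __init__(self):
        self._id = next(BlankNode._ids)

    def __eq__(self, other):
        # Blank node equality: same internal id, nothing else is compared.
        return isinstance(other, BlankNode) and self._id == other._id

    def __hash__(self):
        return hash(("bnode", self._id))


a = BlankNode()
b = BlankNode()
same_as_a = a  # a second occurrence of the same blank node
```

Here `a == same_as_a` holds and `a == b` does not, which is all the second condition asks of an implementation.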

>> Allocating a *fresh blank node* is the action of drawing a new node from the set.
>> ]]
>> http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-concepts/index.html#section-blank-nodes
> 
> It is not quite clear in what way it is "new". It has to be new wrt a given RDF graph (that is, a bnode that is not already used in a given RDF graph).

That's not correct. It has to be globally new. Remember, blank nodes can be shared between graphs.
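
A rough Python sketch of why "new with respect to one graph" is not enough, assuming a single hypothetical allocator `fresh_blank_node` for the whole system and graphs modelled as plain sets of triples:

```python
import itertools

# Assumption: one allocator shared by every graph in the system.
_next_id = itertools.count()


def fresh_blank_node():
    """Return a blank node that this allocator has never handed out before."""
    return ("bnode", next(_next_id))


shared = fresh_blank_node()

# The same blank node occurs in two different graphs.
graph_1 = {(shared, "ex:name", '"Alice"')}
graph_2 = {(shared, "ex:age", '"30"')}

# A fresh node must differ from every node already allocated,
# not merely from the nodes that happen to occur in graph_1.
new_node = fresh_blank_node()
graph_1.add((new_node, "ex:knows", shared))
```

Because the allocator is not tied to any one graph, `new_node` cannot coincide with a blank node already in use in `graph_2`.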

>> [[
>> Since RDF systems generally refer to blank nodes only via such local identifiers, it is necessary to “standardize apart” the blank node identifiers when incorporating data that originates from an external source. This may be done by systematically replacing the blank node identifiers in incoming data with freshly allocated blank node identifiers.
>> ]]
> 
> In fact, if the bnode IDs had global scope, this would still be necessary. The "standardisation apart" is part of the merge operation and is independent of the way bnodes are identified. The "standardisation apart" has to be made at the abstract syntax level, that is, the bnodes themselves, not the IDs, have to be changed.

This isn't about the merge operation; this is about the case of, say, loading a graph into a new slot in a graph store. If blank node identifiers in the incoming data are systematically replaced, then, I believe, this operation is safe; otherwise it is not.

Graph merge is a separate issue, and we don't talk about it in this section.
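
To make the loading case concrete, here is a minimal Python sketch. It assumes incoming blank node identifiers are strings starting with `_:`, and the names `load_graph`, `fresh_blank_node` and `blank_nodes` are invented for the example, not taken from any real store.

```python
import itertools

_next_id = itertools.count()  # assumption: one allocator for the whole store


def fresh_blank_node():
    """Return a blank node never handed out before by this allocator."""
    return ("bnode", next(_next_id))


def load_graph(triples):
    """Copy triples, replacing each incoming '_:label' with a fresh blank node.

    The mapping is per load: the same label maps to the same fresh node
    within one document, but never carries over to another document.
    """
    mapping = {}

    def rename(term):
        if isinstance(term, str) and term.startswith("_:"):
            if term not in mapping:
                mapping[term] = fresh_blank_node()
            return mapping[term]
        return term

    return {tuple(rename(term) for term in triple) for triple in triples}


def blank_nodes(graph):
    """Collect the blank nodes occurring anywhere in a graph."""
    return {term for triple in graph for term in triple if isinstance(term, tuple)}


# Two external documents that both happen to use the label '_:x'.
doc_a = [("_:x", "ex:name", '"Alice"'), ("_:x", "ex:knows", "_:y")]
doc_b = [("_:x", "ex:name", '"Bob"')]

store = {"slot_a": load_graph(doc_a), "slot_b": load_graph(doc_b)}
```

After loading, the two slots share no blank nodes even though both documents used `_:x`, while the two occurrences of `_:x` inside `doc_a` still denote one node.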

> [As a side note, I think things would have been simpler, IMHO, if all bnodes had a globally unique identifier. It would also have made the discussions on the scope of bnodes easier, since we would have avoided discussing the scope of *identifiers*, and confusing the two types of scopes.]

If they have globally unique identifiers, then surely they should be IRIs, no? And then how is that any different from not having blank nodes at all?

It certainly is a mess.

Best,
Richard

Received on Monday, 12 November 2012 16:00:46 UTC