Re: a blank node issue from Ivan Shmakov on 2011-03-01 (semantic-web@w3.org from March 2011)

From: Ivan Shmakov <ivan@main.uusia.org>
Date: Wed, 02 Mar 2011 00:09:06 +0600
To: semantic-web@w3.org
Message-ID: <87aahebtml.fsf@violet.siamics.net>
>>>>> Pat Hayes <phayes@ihmc.us> writes:

[…]

 > But before you label this an "issue", let me turn the scenario
 > around.

[…]

 > Now serialize these "identical" graphs into two identical
 > serializations and send them to a common source and ask it to
 > deserialize them into a single graph. Should it merge these blank
 > nodes into one? It may well be that if more information had been
 > sent, it would have been clear that these were two different
 > people. But even if not, it is clear that there can be no general warrant
 > to presume that two different blank nodes must co-refer, unless of
 > course one knows that the provenance of the information guarantees
 > that they do.

 Actually, the question I'm concerned with is exactly the
 opposite one: is there any practical necessity to /preserve/
 blank node identity when used /as an object/?

 To repeat myself, while serializing subgraphs, it's easy, given
 the current standards and implementations, to “break” the
 following graph:

foo bar _:blank .
baz qux _:blank .

 into one where the objects of the triples aren't the same:

foo bar _:blank1 .
baz qux _:blank2 .

 (Though the respective descriptions of the blank nodes are the
 same.)
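
 A minimal Python sketch of that “breakage”, assuming triples are
 plain tuples of strings and a blank node is any term starting with
 "_:" (the function name is made up for illustration; no particular
 serializer is implied):

```python
from itertools import count

def split_shared_blank_objects(triples):
    """Give every blank-node *object* occurrence its own fresh label."""
    fresh = count(1)
    out = []
    for s, p, o in triples:
        if o.startswith("_:"):
            # Each occurrence is relabelled independently, so a node
            # shared by several triples ends up as several nodes.
            o = f"_:blank{next(fresh)}"
        out.append((s, p, o))
    return out

graph = [("foo", "bar", "_:blank"), ("baz", "qux", "_:blank")]
broken = split_shared_blank_objects(graph)

shared_before = len({o for _, _, o in graph if o.startswith("_:")})
shared_after = len({o for _, _, o in broken if o.startswith("_:")})
```

 Here graph has one blank node shared by both triples, while broken
 has two distinct ones.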

 Now, I wonder what the negative consequences would be in
 practice should we assume that such “breakage” is not an
 exception, but the rule.  Or, in other words, that blank node
 identity /as an object/ is of no semantic value.

 Immediately, it becomes possible:

 • to re-create any graph from the set of concise bounded
   descriptions [1] of its respective (non-blank) subjects;

 • to assign each blank node a content-based identifier (e.g.,
   as per [2].)

 (Leaving cyclic subgraphs involving blank nodes aside for now.)
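
 To illustrate the second point, here is a rough Python sketch of a
 content-based identifier.  It is not the algorithm of [2]; it simply
 hashes the sorted (predicate, object) pairs describing a blank node.
 The function names, the "name" property and its value are all
 assumptions made up for the example, and a cyclic description would
 recurse forever:

```python
import hashlib

def content_id(node, triples):
    """Derive a label for `node` from its own description (acyclic only)."""
    parts = []
    for s, p, o in triples:
        if s == node:
            if o.startswith("_:"):
                # Nested blank nodes are named by their content as well.
                o = content_id(o, triples)
            parts.append(f"{p} {o}")
    text = "\n".join(sorted(parts))
    return "_:h" + hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]

def relabel(triples):
    """Replace every blank node label with its content-based one."""
    def name(term):
        return content_id(term, triples) if term.startswith("_:") else term
    return {(name(s), p, name(o)) for s, p, o in triples}

original = [
    ("foo", "bar", "_:blank"),
    ("baz", "qux", "_:blank"),
    ("_:blank", "name", '"Alice"'),
]
broken = [
    ("foo", "bar", "_:blank1"),
    ("baz", "qux", "_:blank2"),
    ("_:blank1", "name", '"Alice"'),
    ("_:blank2", "name", '"Alice"'),
]
same = relabel(original) == relabel(broken)
```

 Under this scheme the “broken” and the original graph relabel to
 the same set of triples, which is exactly the assumption that blank
 node identity carries no information beyond the node's description.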

 Both of the above are of importance to the distribution of the
 descriptions, since data distribution all too often relies on
 the ways to split the data and name the chunks.

 The feasibility of such an approach depends on what blank nodes
 are used for in practice.  The first blank node example in the
 RDF Primer [3] is all about a blank node that is “defined” by
 its own properties: it doesn't matter whether particular
 addresses are given by the same blank node or not; it's the
 /properties/ of these nodes (or node) that matter.  Also, [1]
 argues against overuse of blank nodes.

 … There's a distant, yet vaguely similar, aspect of IEEE
 floating point arithmetic.  Namely, certain operations may give
 a “not a number” (or NaN) result.  The core property of NaN is
 that it isn't equal to itself.  That is, if x = sqrt (-1), then
 x == x is false.  Perhaps a blank node is just a Semantic NaN?

 But, honestly, I'm not sure.
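
 For the record, the NaN behaviour is easy to check in Python.  Note
 that Python's math.sqrt (-1) raises ValueError instead of returning
 NaN, so this sketch gets its NaN from inf - inf:

```python
import math

inf = float("inf")
x = inf - inf            # an IEEE operation whose result is NaN

is_nan = math.isnan(x)   # x really is a NaN
equal_to_itself = (x == x)
differs_from_itself = (x != x)
```

 Both is_nan and differs_from_itself come out true, while
 equal_to_itself is false.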

PS.  Is there any good example of when a cyclic subgraph involving a
 blank node may be reasonable, BTW?

[…]

[1] http://www.w3.org/Submission/CBD/
[2] http://www.hpl.hp.com/techreports/2002/HPL-2002-216.pdf
[3] http://www.w3.org/TR/rdf-primer/

-- 
FSF associate member #7257
Received on Tuesday, 1 March 2011 18:09:55 UTC