Re: Well Behaved RDF - Taming Blank Nodes, etc. from Ivan Shmakov on 2012-12-13 (semantic-web@w3.org from December 2012)

From: Ivan Shmakov <oneingray@gmail.com>
Date: Fri, 14 Dec 2012 00:39:51 +0700
To: semantic-web@w3.org
Message-ID: <86k3slq2ag.fsf@gray.siamics.net>

>>>>> David Booth <david@dbooth.org> writes:
>>>>> On Thu, 2012-12-13 at 01:04 +0700, Ivan Shmakov wrote:
>>>>> David Booth <david@dbooth.org> writes:

[…]

 >>> A Well-Behaved RDF graph is an RDF graph that can be serialized as
 >>> Turtle without the use of explicit blank node identifiers.  I. e.,
 >>> only blank nodes that are implicitly created by the bracket "[
 >>> ... ]" or list "( ... )" notations are permitted.

 >> My suggestion would be to only disallow cycles composed entirely of
 >> blank nodes.

 >> As it seems, an RDF graph following this restriction could be
 >> “canonicalized” quite easily, . . . .

 > Yes, that is pretty much my intent, but I thought it would be easier
 > to explain (to casual RDF authors) by defining it in terms of
 > avoiding the need for explicit blank node identifiers in Turtle.
 > Otherwise we would have to get into the details of defining what is a
 > blank node tree and how it relates to the RDF graph, which seemed
 > harder to explain.  But I could be wrong.

 The question is: how do we state that two nodes relate to a
 single blank node?  Say, :john and :mary are siblings, thus
 their :mother and :father are the same two :Person's.  (Though
 perhaps such a case, that there /may/ be blank node
 identifiers, provided that they're only referenced as Objects
 of arcs with a non-blank Subject, may be formulated as an
 exception to the “no blank node identifiers” rule?)
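
 For reference, the weaker restriction suggested earlier (no
 cycles composed entirely of blank nodes) is easy to check
 mechanically.  What follows is only a rough sketch, not a
 definitive implementation.  It assumes that a graph is a list of
 (subject, predicate, object) string tuples, that blank nodes are
 the strings starting with "_:", and that “cycle” means a directed
 one.  The names is_blank and has_blank_cycle are made up for
 illustration.

```python
def is_blank(term):
    # Assumption: blank nodes are written as strings starting with "_:".
    return isinstance(term, str) and term.startswith("_:")


def has_blank_cycle(triples):
    """Return True if some directed cycle runs through blank nodes only."""
    # Keep only the arcs whose subject and object are both blank nodes.
    edges = {}
    for s, _p, o in triples:
        if is_blank(s) and is_blank(o):
            edges.setdefault(s, set()).add(o)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in edges.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                return True  # reached a node already on the current path
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(edges))
```

 Note that a shared blank node, such as a :father referenced by
 two siblings, is not a cycle under this check; it only violates
 the stricter “no explicit identifiers” rule.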

 But I've got the point.  Indeed, for my own needs, I've
 contemplated “weakening” graphs like the following one:

_:father a :Person .
:john  :father _:father .
:marth :father _:father .

 down to (the Well-Behaved):

:john  :father [ a :Person ] .
:marth :father [ a :Person ] .

 There's a sure loss of information in treating the former graph
 as if it were the latter, but somehow, it seems that not only
 does it make the thing easier to explain, but it may also
 further simplify the data management.  (For instance, if no
 blank node is ever allowed to be the “same” as another, one
 could use content-based identifiers for all the blank nodes,
 never having to decide whether this is the same [ a :Person ]
 or not.)
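
 To illustrate what a content-based identifier could look like,
 here is a minimal sketch.  It assumes the same representation as
 before (string tuples, blank nodes starting with "_:") and a
 graph without cycles of blank nodes, since otherwise the
 recursion would not terminate.  The function name content_id,
 the "_:h" prefix and the choice of SHA-256 are arbitrary
 assumptions of mine, not an established scheme.

```python
import hashlib


def is_blank(term):
    # Assumption: blank nodes are written as strings starting with "_:".
    return isinstance(term, str) and term.startswith("_:")


def content_id(node, triples):
    """Derive an identifier for a blank node from its outgoing arcs only."""
    arcs = []
    for s, p, o in triples:
        if s == node:
            # Nested blank nodes are replaced by their own content-based id.
            value = content_id(o, triples) if is_blank(o) else o
            arcs.append(f"{p} {value}")
    # Sort so that the order of the triples does not affect the result.
    text = "\n".join(sorted(arcs))
    return "_:h" + hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
```

 With this, any two blank nodes described only as [ a :Person ]
 get one and the same identifier, so the question of whether they
 are “the same” node never has to be asked.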

 And for the task I've been considering it for, the information
 lost was likely to be insignificant anyway, since for every other
 use there was to be a URI (either a freshly minted urn:uuid:
 one, or some other).
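
 The “weakening” itself could be sketched along the following
 lines.  This is an illustration under the same assumptions as
 above: string tuples, blank nodes starting with "_:", and no
 cycles of blank nodes.  The name weaken and the generated _:bN
 labels are hypothetical.  Every reference to a blank node gets
 its own fresh copy of that node's description.

```python
import itertools


def is_blank(term):
    # Assumption: blank nodes are written as strings starting with "_:".
    return isinstance(term, str) and term.startswith("_:")


def weaken(triples):
    """Give every reference to a blank node its own copy of that node."""
    triples = list(triples)
    # Outgoing arcs of each blank node, i.e. its "description".
    described = {}
    for s, p, o in triples:
        if is_blank(s):
            described.setdefault(s, []).append((p, o))
    referenced = {o for _s, _p, o in triples if is_blank(o)}
    counter = itertools.count(1)
    out = []

    def copy_node(node):
        # Emit a fresh blank node carrying a copy of the description.
        fresh = f"_:b{next(counter)}"
        for p, o in described.get(node, ()):
            out.append((fresh, p, copy_node(o) if is_blank(o) else o))
        return fresh

    for s, p, o in triples:
        if is_blank(s):
            continue  # re-emitted by copy_node at each point of reference
        out.append((s, p, copy_node(o) if is_blank(o) else o))
    # Blank nodes that nothing refers to are kept, copied exactly once.
    for node in described:
        if node not in referenced:
            copy_node(node)
    return out
```

 Applied to the three-triple graph above, this yields four
 triples: :john and :marth each point to a distinct blank node,
 and each of those nodes is separately typed as a :Person.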

-- 
FSF associate member #7257

Received on Thursday, 13 December 2012 17:44:30 UTC