Re: Well Behaved RDF - Taming Blank Nodes, etc.

On Wed, 2012-12-12 at 09:43 -0800, Pat Hayes wrote:
> On Dec 12, 2012, at 9:01 AM, David Booth wrote:
[ . . . ]
> >  A Well-Behaved RDF graph is an RDF graph that can be serialized 
> >  as Turtle without the use of explicit blank node identifiers. 
> >  I.e., only blank nodes that are implicitly created by the 
> >  bracket "[ ... ]" or list "( ... )" notations are permitted. 
> 
> That is too restrictive. There is a real need to be able to describe
> things such as "Joe's father" or "a woman in a red dress" which are
> naturally phrased as bnodes with identifying descriptors attached to
> them. 

Perhaps, for some RDF authors.  And those authors could use full RDF
instead of the Well Behaved RDF profile.  But according to
http://web.ing.puc.cl/~marenas/publications/iswc11.pdf
the vast majority of RDF documents (over 98% of their samples) use blank
nodes in non-problematic ways.  (I.e., they contain no blank node
cycles, and thus do not cause the graph isomorphism complexity problem.)
At present the many applications that process RDF have to pay for the
sins of those (very) few RDF graphs that use blank nodes in problematic
ways.
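
To make "blank node cycle" concrete, here is a minimal sketch of such a
check.  It is not code from the paper.  It assumes triples are plain
(subject, predicate, object) string tuples, that blank nodes are written
with a "_:" prefix, and that a cycle means an undirected cycle among
edges joining two blank nodes.  The names is_bnode and has_bnode_cycle
are made up for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def is_bnode(term: str) -> bool:
    # Assumption: blank nodes are labelled "_:something", as in N-Triples.
    return term.startswith("_:")


def has_bnode_cycle(triples: Iterable[Triple]) -> bool:
    """Return True if the undirected graph formed by triples whose
    subject and object are both blank nodes contains a cycle."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    seen_edges: Set[frozenset] = set()
    for s, _p, o in triples:
        if not (is_bnode(s) and is_bnode(o)):
            continue  # edges touching a URI or literal cannot form a bnode cycle
        if s == o:
            return True  # a self-loop is treated as a cycle here
        edge = frozenset((s, o))
        if edge in seen_edges:
            continue  # parallel edges between the same pair are counted once
        seen_edges.add(edge)
        root_s, root_o = find(s), find(o)
        if root_s == root_o:
            return True  # s and o were already connected, so this edge closes a cycle
        parent[root_s] = root_o
    return False


tree = [("ex:x", "ex:q", "_:a"), ("_:a", "ex:p", "_:b"), ("_:a", "ex:p", "_:c")]
cycle = [("_:a", "ex:p", "_:b"), ("_:b", "ex:p", "_:c"), ("_:c", "ex:p", "_:a")]
print(has_bnode_cycle(tree))   # False
print(has_bnode_cycle(cycle))  # True
```

Note that this only tests for cycles.  The Well-Behaved RDF definition
quoted above is a separate condition on what Turtle can serialize
without blank node labels, and this sketch does not check it.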

Actually, it would be interesting to examine whether those <2% of graphs
that did have blank node cycles really needed them.  My suspicion is
that the authors could have simply minted a few URIs to break those
blank node cycles and turn them into non-problematic blank node trees.
In the nearly 4 million RDF documents Mallea, Arenas, Hogan, and
Polleres examined, the maximum blank node treewidth they found was 7,
which I think (though a graph theory expert would have to confirm) means
that only 6 URIs would have had to be minted to turn it into a tree.
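
The idea of minting URIs to break a cycle can be sketched as follows.
This is an illustration only, under the same assumptions as before:
triples are string tuples and blank nodes carry a "_:" prefix.  The
function names and the http://example.org/id/ base are hypothetical, and
the sketch does not verify the 6-URI estimate above.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def mint_uris(triples: Iterable[Triple],
              bnodes_to_name: Set[str],
              base: str = "http://example.org/id/") -> List[Triple]:
    """Replace each chosen blank node with a freshly minted URI.
    The base URI is a placeholder, not a recommended namespace."""
    def rename(term: str) -> str:
        if term in bnodes_to_name:
            return "<" + base + term[2:] + ">"  # drop the "_:" prefix
        return term

    # Predicates are never blank nodes in RDF, so only s and o are renamed.
    return [(rename(s), p, rename(o)) for s, p, o in triples]


def bnode_edges(triples: Iterable[Triple]) -> List[Tuple[str, str]]:
    """List the (subject, object) pairs in which both ends are blank nodes."""
    return [(s, o) for s, _p, o in triples
            if s.startswith("_:") and o.startswith("_:")]


# A three-node blank node cycle: a -> b -> c -> a.
cycle = [("_:a", "ex:p", "_:b"), ("_:b", "ex:p", "_:c"), ("_:c", "ex:p", "_:a")]

fixed = mint_uris(cycle, {"_:c"})
print(bnode_edges(cycle))  # three bnode-to-bnode edges, forming a cycle
print(bnode_edges(fixed))  # [('_:a', '_:b')]
```

Naming the single node _:c leaves only one edge between blank nodes, so
the remaining blank node structure is a tree.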



-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.

Received on Wednesday, 12 December 2012 19:09:53 UTC