Re: Converting RDF to JSON-LD : shared lists between graphs from David Booth on 2014-07-28 (public-rdf-comments@w3.org from July 2014)

From: David Booth <david@dbooth.org>
Date: Mon, 28 Jul 2014 00:46:06 -0400
To: Dan Brickley <danbri@google.com>
CC: Pat Hayes <phayes@ihmc.us>, Kingsley Idehen <kidehen@openlinksw.com>, public-rdf-comments Comments <public-rdf-comments@w3.org>
Message-ID: <53D5D58E.1020103@dbooth.org>

On 07/27/2014 12:29 PM, Dan Brickley wrote:
> On 26 July 2014 03:22, David Booth <david@dbooth.org> wrote:
[ . . . ]
>> That's a good example of how the vast majority of RDF does not need explicit
>> blank nodes.  The above can be written just fine as Well Behaved RDF using
>> only implicit blank nodes:
>>
>> [] a :MusicEvent ;
>>    :name "B.B. King with Jonathon 'Boogie' Long" ;
>>    :location [
>>      a :Place ;
>>      :address [ a :PostalAddress ;
>>        :streetAddress "..." ;
>>        :postalCode "..." ]
>>      ] ;
>>    :performer [ a :MusicGroup ;
>>      :name "Jonathon 'Boogie' Long" ] ;
>>    :eventStatus :EventRescheduled ;
>>    :previousStartDate "2013-09-30T19:30"^^xsd:dateTime .
>
> If the graph had a property of the MusicGroup bnode pointing back to
> the MusicEvent (e.g. 'performance', or 'event' in some other well
> known RDF vocabulary), then it would cease to be "well behaved", on
> your definition, since the bnode-infected portion of the graph lacks
> URIs. You'd rather there were URIs on those nodes, that's clear. But
> if they are to be bNodes, you'd really prefer this hypothetical
> reverse-direction property to be removed from the graph than pollute
> it?

No, that's not what I'm suggesting.  Certainly if the choice is between 
more complete data with a non-well-behaved blank node and less complete 
data, it would be more helpful to publish the more complete data.  But 
that's a false dichotomy.

I'm suggesting the widespread adoption of a simpler profile of RDF that 
allows implicit blank nodes but disallows explicit blank nodes -- Well 
Behaved RDF -- in order to simplify RDF processing.

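To add the property you're suggesting to the above example, while still 
conforming to Well Behaved RDF, either the MusicEvent node or the 
MusicGroup node could be given a URI instead of being a blank node.  The 
back-reference creates a cycle, and a cycle cannot be written using only 
implicit blank nodes, so one node in it needs a name.  All the other 
blank nodes could remain as implicit blank nodes.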

One reason why this is important is that any significant software effort 
needs to do regression testing.  Regression testing with virtually any 
other data representation is easy: just run cmp on the two files and 
see if they differ.  For most kinds of data, the same software will 
serialize the same data the same way.  If it doesn't, due to random 
differences in hashing, etc., then it is usually easy enough to first 
put the data into a canonical form, for example by sorting it.  But 
when RDF has unrestricted blank nodes, sorting is not enough, because 
blank node labels can differ arbitrarily between runs, and the task 
becomes ridiculously more difficult.
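
To make that concrete, here is a minimal sketch in Python of the kind of 
sort-then-compare regression check I have in mind.  The example.org URIs, 
the blank node labels and the canonical() helper are all made up for 
illustration, and the data is simply treated as N-Triples-style lines.

```python
def canonical(ntriples: str) -> list[str]:
    """Canonicalize line-oriented data: drop blank lines, then sort."""
    return sorted(line.strip() for line in ntriples.splitlines() if line.strip())

# The same two triples, serialized in a different order on each run.
run_a = """
<http://example.org/e1> <http://example.org/name> "B.B. King" .
<http://example.org/e1> <http://example.org/status> <http://example.org/Rescheduled> .
"""
run_b = """
<http://example.org/e1> <http://example.org/status> <http://example.org/Rescheduled> .
<http://example.org/e1> <http://example.org/name> "B.B. King" .
"""

# With a URI as the subject, sorting is enough to make the runs compare equal.
same_with_uris = canonical(run_a) == canonical(run_b)

# The same graph with a blank node subject, where each run happened to
# pick a different (hypothetical) label: sorting no longer helps.
run_c = run_a.replace("<http://example.org/e1>", "_:b0")
run_d = run_b.replace("<http://example.org/e1>", "_:genid42")
same_with_bnodes = canonical(run_c) == canonical(run_d)

print(same_with_uris, same_with_bnodes)
```

The two blank-node files describe the same graph, but a plain textual 
comparison reports a difference, so the test needs something smarter 
than sort and cmp.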

Honestly, if some developer came to me proposing a fantastic new data 
representation that was promised to be the greatest thing since Unicode, 
but it still had one flaw: you couldn't compare two files in that 
representation for "equality" without potentially solving an NP-complete 
problem, I'd say forget it -- come back when the design is finished.
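
As a rough illustration of why that comparison is expensive in the 
general case, here is a deliberately naive Python sketch that checks 
whether two graphs are the same up to blank node renaming by trying 
every possible relabeling.  The representation (triples as tuples of 
plain strings, blank nodes marked by a "_:" prefix) and the function 
names are my own assumptions, not any library's API.  Real tools are 
much smarter than this for typical data, but it shows where the blowup 
comes from: n blank nodes means up to n! candidate mappings.

```python
from itertools import permutations

def bnodes(graph):
    """Return the sorted blank node labels used anywhere in the graph."""
    return sorted({t for triple in graph for t in triple if t.startswith("_:")})

def isomorphic(g1, g2):
    """Brute force: try every bijection from g1's blank nodes to g2's."""
    b1, b2 = bnodes(g1), bnodes(g2)
    if len(g1) != len(g2) or len(b1) != len(b2):
        return False
    for perm in permutations(b2):
        mapping = dict(zip(b1, perm))
        # Rename blank nodes in g1; URIs and literals pass through unchanged.
        renamed = {tuple(mapping.get(t, t) for t in triple) for triple in g1}
        if renamed == g2:
            return True
    return False

# A MusicEvent and a MusicGroup pointing at each other, labeled two ways.
g1 = {("_:a", ":performer", "_:b"), ("_:b", ":event", "_:a")}
g2 = {("_:x", ":event", "_:y"), ("_:y", ":performer", "_:x")}
# Same labels as g2, but both properties point the same direction.
g3 = {("_:x", ":event", "_:y"), ("_:x", ":performer", "_:y")}

print(isomorphic(g1, g2), isomorphic(g1, g3))
```

With URIs on the nodes, none of this search is needed: the two sets of 
triples can be compared directly.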

Certainly I can work around this problem (even if I do curse under my 
breath while doing so), and I'm sure everyone else on this list can too.  
But the people on this list are *not* average software developers.  
They're the elite of the elite of RDF *experts*.  RDF is *not* so easy 
for average developers.

As everyone on this list should already know, I'm a strong advocate for 
RDF.  But from experience I'm also coming to the conclusion that RDF is 
still harder than it should be (and needs to be), and I think that is 
significantly hindering adoption.  Kingsley advocates more and better 
education about RDF -- and certainly that can help -- but after 10 years 
of explaining RDF I think the problem is more fundamental than 
inadequate messaging or education.  I think we have not yet designed RDF 
to be simple enough.  The simple parts are indeed simple -- triples as 
assertions, etc. -- but the subtle complexities are still too hard.

David

Received on Monday, 28 July 2014 04:46:38 UTC