- From: David Booth <david@dbooth.org>
- Date: Mon, 28 Jul 2014 00:46:06 -0400
- To: Dan Brickley <danbri@google.com>
- CC: Pat Hayes <phayes@ihmc.us>, Kingsley Idehen <kidehen@openlinksw.com>, public-rdf-comments Comments <public-rdf-comments@w3.org>
On 07/27/2014 12:29 PM, Dan Brickley wrote:
> On 26 July 2014 03:22, David Booth <david@dbooth.org> wrote:
[ . . . ]
>> That's a good example of how the vast majority of RDF does not need explicit
>> blank nodes. The above can be written just fine as Well Behaved RDF using
>> only implicit blank nodes:
>>
>> [] a :MusicEvent ;
>>    :name "B.B. King with Jonathon 'Boogie' Long" ;
>>    :location [
>>       a :Place ;
>>       :address [ a :PostalAddress ;
>>          :streetAddress "..." ;
>>          :postalCode "..." ]
>>    ] ;
>>    :performer [ a :MusicGroup ;
>>       :name "Jonathon 'Boogie' Long" ] ;
>>    :eventStatus :EventRescheduled ;
>>    :previousStartDate "2013-09-30T19:30"^^xsd:dateTime .
>
> If the graph had a property of the MusicGroup bnode pointing back to
> the MusicEvent (e.g. 'performance', or 'event' in some other well
> known RDF vocabulary), then it would cease to be "well behaved", on
> your definition, since the bnode-infected portion of the graph lacks
> URIs. You'd rather there were URIs on those nodes, that's clear. But
> if they are to be bNodes, you'd really prefer this hypothetical
> reverse-direction property to be removed from the graph than pollute
> it?

No, that's not what I'm suggesting. Certainly if the choice is between
more complete data with a non-well-behaved blank node and less complete
data, it would be more helpful to publish the more complete data. But
that's a false dichotomy.

I'm suggesting the widespread adoption of a simpler profile of RDF that
allows implicit blank nodes but disallows explicit blank nodes -- Well
Behaved RDF -- in order to simplify RDF processing.

To add the property you're suggesting to the above example, while still
conforming to Well Behaved RDF, either the MusicEvent node or the
MusicGroup node could be assigned a URI instead of a blank node. All
the other blank nodes could remain as implicit blank nodes.

One reason why this is important is that any significant software effort
needs to do regression testing.
Regression testing with virtually any other data representation is easy:
just run cmp on the two files and see if they differ. For most kinds of
data, the same software will serialize the same data the same way. If it
doesn't, due to random differences in hashing, etc., then it is usually
easy enough to first serialize the data in a canonical form, for example
by sorting it. But with RDF that has unrestricted blank nodes, the task
becomes ridiculously more difficult.

Honestly, if some developer came to me proposing a fantastic new data
representation that was promised to be the greatest thing since Unicode,
but it still had one flaw: you couldn't compare two files in that
representation for "equality" without potentially solving an NP-complete
problem, I'd say forget it -- come back when the design is finished.

Certainly I can work around this problem (even if I do curse under my
breath while doing so), and I'm sure everyone else on this list can too.
But the people on this list are *not* average software developers.
They're the elite of the elite of RDF *experts*. RDF is *not* so easy
for average developers.

As everyone on this list should already know, I'm a strong advocate for
RDF. But from experience I'm also coming to the conclusion that RDF is
still harder than it should be (and needs to be), and I think that is
significantly hindering adoption. Kingsley advocates more and better
education about RDF -- and certainly that can help -- but after 10 years
of explaining RDF I think the problem is more fundamental than
inadequate messaging or education. I think we have not yet designed RDF
to be simple enough. The simple parts are indeed simple -- triples as
assertions, etc. -- but the subtle complexities are still too hard.

David
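
P.S. To make the regression-testing point concrete, here is a minimal
sketch in Python. It assumes each graph has already been serialized as
N-Triples, one triple per line, and it uses "sort the lines, then
compare" as a stand-in for running sort and cmp. The example.org URIs,
the blank node labels, and the helper names are hypothetical and chosen
only for illustration.

```python
def canonical_lines(ntriples):
    """Return the non-empty lines of an N-Triples string, stripped and sorted."""
    return sorted(line.strip() for line in ntriples.splitlines() if line.strip())


def same_by_sorting(a, b):
    """Textual comparison after sorting: the cmp-style regression check."""
    return canonical_lines(a) == canonical_lines(b)


# Two ground graphs (URIs only): same triples, different statement order.
ground_a = """
<http://example.org/event1> <http://example.org/performer> <http://example.org/group1> .
<http://example.org/group1> <http://example.org/event> <http://example.org/event1> .
"""
ground_b = """
<http://example.org/group1> <http://example.org/event> <http://example.org/event1> .
<http://example.org/event1> <http://example.org/performer> <http://example.org/group1> .
"""

# The same cyclic shape with explicit blank nodes. The two graphs are
# isomorphic (map _:e to _:x and _:g to _:y), but the labels differ,
# as they may between two runs of the same serializer.
bnode_a = """
_:e <http://example.org/performer> _:g .
_:g <http://example.org/event> _:e .
"""
bnode_b = """
_:x <http://example.org/performer> _:y .
_:y <http://example.org/event> _:x .
"""

print(same_by_sorting(ground_a, ground_b))  # True
print(same_by_sorting(bnode_a, bnode_b))    # False
```

Sorting is enough for the ground graphs. The blank-node pair compares
as different even though the two graphs say the same thing, so a
textual check is no longer sufficient and some form of blank node
matching is needed instead.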
Received on Monday, 28 July 2014 04:46:38 UTC