Publishing data/ontologies: release early & often, or get them right first? from Danny Ayers on 2008-03-14 (semantic-web@w3.org from March 2008)

From: Danny Ayers <danny.ayers@gmail.com>
Date: Fri, 14 Mar 2008 23:27:03 +0100
To: "semantic-web at W3C" <semantic-web@w3c.org>
Message-ID: <1f2ed5cd0803141527w19e4599k9ddf29f033c36717@mail.gmail.com>

The other day, in conversation with Richard Cyganiak (recorded at
[1]), I paraphrased something timbl had mentioned (recorded at [2]):
when expressing data as RDF on the Web, it's possible to make a rough
guess at how the information should appear, and over time
incrementally/iteratively improve its alignment with the rest of the
world.

I was upbeat on this (and my paraphrasing probably lost a lot of
timbl's intent) because personally when doing things with RDF I find
it hugely advantageous over a traditional SQL RDBMS approach simply
because you can be more agile - not getting your schema right first
time isn't an obstacle to development. But the stuff I play with
(which I do usually put on the web somewhere) isn't likely to develop
a forward chain of dependencies.

But Richard pointed out (no doubt I'm badly paraphrasing again) that
making statements, once they are published on the Web, brings with it
a level of commitment. I don't think he used these words, but perhaps
a kind of responsibility. The example he gave of where problems can
arise was DBpedia - the modelling, and the use of terms, is revised
every couple of months or so. Anyone who built an app based on last
month's vocab might find the app broken on next month's revision. I
think Richard had properties particularly in mind - though even when
Cool URIs are maintained, might not changes around connections to
individuals still be problematic?

So I was wondering if anyone had any thoughts on how to accommodate
rapid development (or at least being flexible over time) without
repeatedly breaking consuming applications. How deep does our
modelling have to go to avoid this kind of problem? Can the versioning
bits of OWL make a significant difference?

Or to turn it around, as a consumer of Semantic Web data, how do you
avoid breakage due to changes upstream? Should we be prepared to
retract/replace whole named graphs containing ontologies? Do we need
to keep provenance for *everything*?
I suspect this is related - if we have a locally closed world, where
do we put the boundaries?
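
To make the named-graph question a bit more concrete, here is a rough
sketch of what "retract/replace a whole graph, keeping provenance"
could look like on the consumer side. It is plain Python with no RDF
library; GraphStore, NamedGraph and the example.org URIs and ex: terms
are all made up for illustration, not taken from DBpedia or any real
toolkit.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

Triple = tuple[str, str, str]


@dataclass
class NamedGraph:
    source: str          # URL this copy of the graph was fetched from
    retrieved: datetime  # when this copy was stored
    triples: set[Triple] = field(default_factory=set)


class GraphStore:
    """Hypothetical in-memory store: one provenance record per named graph."""

    def __init__(self) -> None:
        self._graphs: dict[str, NamedGraph] = {}

    def replace_graph(self, name: str, source: str, triples) -> None:
        # Swap the whole graph at once, so statements from an old
        # revision never linger alongside the new one.
        self._graphs[name] = NamedGraph(
            source, datetime.now(timezone.utc), set(triples)
        )

    def retract_graph(self, name: str) -> None:
        self._graphs.pop(name, None)

    def provenance(self, name: str) -> NamedGraph:
        return self._graphs[name]

    def match(self, s=None, p=None, o=None):
        # Yield (graph name, triple) pairs; None acts as a wildcard.
        for name, graph in self._graphs.items():
            for triple in graph.triples:
                if all(q is None or q == v for q, v in zip((s, p, o), triple)):
                    yield name, triple


# Hypothetical upstream revision: a property is renamed between releases.
VOCAB = "http://example.org/graphs/vocab"
store = GraphStore()
store.replace_graph(
    VOCAB,
    "http://example.org/dump-2008-02",
    [("ex:Berlin", "ex:population", "3400000")],
)
store.replace_graph(
    VOCAB,
    "http://example.org/dump-2008-03",
    [("ex:Berlin", "ex:populationTotal", "3400000")],
)

old_hits = list(store.match(p="ex:population"))
new_hits = list(store.match(p="ex:populationTotal"))
```

After the second replace_graph call, old_hits is empty and new_hits
carries the graph name, so the app can at least tell which upstream
revision a statement came from. It does nothing about the harder part:
the app's own queries still use the old property name.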

Cheers,
Danny.

[1] http://blogs.talis.com/nodalities/2008/03/a_chat_with_richard_cyganiak.php
[2] http://blogs.zdnet.com/semantic-web/?p=105

-- 
http://dannyayers.com
~
http://blogs.talis.com/nodalities/this_weeks_semantic_web/

Received on Friday, 14 March 2008 22:27:38 UTC