Re: Publishing data/ontologies: release early & often, or get them right first? from Giovanni Tummarello on 2008-03-15 (semantic-web@w3.org from March 2008)

From: Giovanni Tummarello <giovanni.tummarello@deri.org>
Date: Sat, 15 Mar 2008 15:31:43 +0000
To: "Valentin Zacharias" <Zacharias@fzi.de>
Cc: "Danny Ayers" <danny.ayers@gmail.com>, "semantic-web at W3C" <semantic-web@w3c.org>
Message-ID: <210271540803150831r4f2229d2y5afeefe5ae436a37@mail.gmail.com>
I really see no other solution than coming up with a boring but needed
methodology for practically versioning ontologies, and automatically
migrating the data.

When a new version is created, it should be reflected in the URIs of
probably every property. Mappings to the previous versions should be
provided in the form of SPARQL CONSTRUCT queries, or by pointing to
something like a "semantic web pipe" (which has SPARQL CONSTRUCT among
its operators) [1], or by whatever other method.

Implementing software should then simply perform such steps when
fetching the data.
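
To make the idea concrete, here is a minimal sketch in plain Python of
the fetch-time migration step. Everything named in it is an assumption
for illustration: the example.org namespaces, the mbox to mailbox rename
(borrowed from Valentin's example below), and the migrate helper. The
CONSTRUCT string only shows what a published mapping might look like;
the sketch does not execute it, and a simple rename table stands in for it.

```python
# Hypothetical versioned namespaces; not real vocabularies.
V1 = "http://example.org/vocab/1/"
V2 = "http://example.org/vocab/2/"

# What a mapping published with version 2 might look like as a
# SPARQL CONSTRUCT query. Shown for illustration only, never executed here.
V1_TO_V2_CONSTRUCT = """
PREFIX v1: <http://example.org/vocab/1/>
PREFIX v2: <http://example.org/vocab/2/>
CONSTRUCT { ?s v2:mailbox ?o }
WHERE     { ?s v1:mbox ?o }
"""

# Plain-Python stand-in for such mappings:
# old property URI -> property URI in the next version.
MIGRATIONS = {
    V1 + "mbox": V2 + "mailbox",
}


def migrate(triples, migrations=MIGRATIONS):
    """Rewrite each predicate to the newest version reachable via `migrations`.

    `triples` is an iterable of (subject, predicate, object) tuples.
    Mappings are followed transitively (v1 -> v2 -> v3 ...); the `seen`
    set guards against accidental cycles in the mapping table.
    Triples with unmapped predicates pass through unchanged.
    """
    migrated = []
    for s, p, o in triples:
        seen = set()
        while p in migrations and p not in seen:
            seen.add(p)
            p = migrations[p]
        migrated.append((s, p, o))
    return migrated


# Data as it might arrive from a source still publishing version 1.
fetched = [
    ("http://example.org/people/alice", V1 + "mbox", "mailto:alice@example.org"),
    ("http://example.org/people/alice", "http://example.org/other/name", "Alice"),
]
current = migrate(fetched)
```

A real client would run the published CONSTRUCT queries with a SPARQL
engine instead. Note that, unlike the rename table above, a bare
CONSTRUCT emits only the triples it matches, so untouched data would
have to be carried over separately.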

Is anybody aware of an existing dictionary specifically for this
purpose? If not, we should create it right away, accept the extra (but
inevitable, imo) complication, and move on.

Giovanni

On Sat, Mar 15, 2008 at 10:18 AM, Valentin Zacharias <Zacharias@fzi.de> wrote:
>
>
>  really interesting and important question!
>
>  I believe that with the way ontologies are used in today's applications[1],
>  changing or removing elements from an ontology will break applications; e.g.
>  rename foaf:mbox to foaf:mailbox and hundreds of applications will need
>  changing. It is for this reason that you cannot use agile development
>  processes for ontologies that are really used in a distributed setting.
>
>  Designing ontologies for this setting is very similar to designing public
>  APIs, and I recommend everyone to watch Joshua Bloch's excellent presentation
>  on How to Design a good API [2] - a lot of what he says applies to
>  ontologies as well. In particular he talks about the development process of
>  the API [ontology] before it becomes public - and here you can and should
>  use agile processes. He also proposes to code against the API [ontology]
>  before the API [ontology] is implemented; to create three programs that use
>  the API [ontology] before it becomes public.
>
>  Having said that, ontologies do make it simple to allow a certain degree of
>  flexibility - such as adding additional attributes. If the people using the
>  ontology are aware that a certain part of the ontology is subject to many
>  changes, it is also relatively easy to create programs that can tolerate
>  this. For example a few years back we created an Event Ontology [3] that had
>  a stable core with things such as time, location etc. and an event category
>  intended to be extended decentrally (e.g. need a SpeedcoreDanceParty
>  category with an attribute average beats per minute? Just add it!). People
>  writing software exchanging information with this ontology would know to be
>  careful to accommodate unknown subclasses of event category (and could
>  fall back to treat an event of an unknown category as a generic event with
>  a full text description and all the attributes and relations from the
>  reliable core). There are many things wrong with this Event Ontology from
>  back then, but I believe this pattern of (domain dependent) known and
>  controlled flexibility is a good one.
>
>  As for ontology evolution, a solution could be to make it possible to
>  retrieve an executable mapping from any version of an ontology to any other
>  version - but I'm not aware of any ontology evolution or versioning solution
>  that does this (and it's not even possible in general).
>
>  cu
>
>  [1]: If you like, there is an article discussing whether this way of using
>  ontologies is the right way in the Jan/Feb issue of IEEE Internet Computing:
>  http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4428341
>  [2]: http://www.infoq.com/presentations/effective-api-design
>  [3]: http://www.valentinzacharias.de/papers/ZachariasSibler2004.pdf
>
>  --
>  email: zacharias@fzi.de
>  phone: +49-721-9654-806
>  fax  : +49-721-9654-807
>  http://www.vzach.de/blog
>
>  =======================================================================
>  FZI  Forschungszentrum Informatik an der Universität Karlsruhe (TH)
>  Haid-und-Neu-Str. 10-14, 76131 Deutschland, http://www.fzi.de
>  SdbR, Az: 14-0563.1 Regierungspräsidium Karlsruhe
>  Vorstand: Rüdiger Dillmann, Michael Flor, Jivka Ovtcharova, Rudi Studer
>  Vorsitzender des Kuratoriums: Ministerialdirigent Günther Leßnerkraus
>  =======================================================================
>
>
> -----Original Message-----
>  From: semantic-web-request@w3.org on behalf of Danny Ayers
>  Sent: Fri 3/14/2008 11:27 PM
>  To: semantic-web at W3C
>  Subject: Publishing data/ontologies: release early & often, or get them
>  right first?
>
>
>  The other day, in conversation with Richard Cyganiak (recorded at
>  [1]), I paraphrased something timbl had mentioned (recorded at [2]) :
>  when expressing data as RDF on the Web, it's possible to make a rough
>  guess at how the information should appear, and over time
>  incrementally/iteratively improve its alignment with the rest of the
>  world.
>
>  I was upbeat on this (and my paraphrasing probably lost a lot of
>  timbl's intent) because personally when doing things with RDF I find
>  it hugely advantageous over a traditional SQL RDBMS approach simply
>  because you can be more agile - not getting your schema right first
>  time isn't an obstacle to development. But the stuff I play with
>  (which I do usually put on the web somewhere) isn't likely to develop
>  a forward chain of dependencies.
>
>  But Richard pointed out (no doubt badly paraphrasing again) that the
>  making of statements when published on the Web brought with it a level
>  of commitment. I don't think he used these words, but perhaps a kind
>  of responsibility. The example he gave of where problems can arise was
>  DBpedia - the modelling, use of terms, is revised every couple of
>  months or so. Anyone who built an app based on last month's vocab might
>  find the app broken on next month's revision. I think Richard had
>  properties particularly in mind - though even when Cool URIs are
>  maintained, might not changes around connections to individuals still
>  be problematic?
>
>  So I was wondering if anyone had any thoughts on how to accommodate
>  rapid development (or at least being flexible over time) without
>  repeatedly breaking consuming applications. How deep does our
>  modelling have to go to avoid this kind of problem? Can the versioning
>  bits of OWL make a significant difference?
>
>  Or to turn it around, as a consumer of Semantic Web data, how do you
>  avoid breakage due to changes upstream? Should we be prepared to
>  retract/replace whole named graphs containing ontologies, do we need
>  to keep provenance for *everything*?
>  I suspect related - if we have a locally closed world, where do we put
>  the boundaries?
>
>  Cheers,
>  Danny.
>
>  [1]
>  http://blogs.talis.com/nodalities/2008/03/a_chat_with_richard_cyganiak.php
>  [2] http://blogs.zdnet.com/semantic-web/?p=105
>
>  --
>  http://dannyayers.com
>  ~
>  http://blogs.talis.com/nodalities/this_weeks_semantic_web/
>
>
>
>
Received on Saturday, 15 March 2008 15:32:20 UTC