Re: URIs and Unique IDs from Conor Shankey on 2008-11-04 (semantic-web@w3.org from November 2008)

From: Conor Shankey <cshankey@reinvent.com>
Date: Tue, 04 Nov 2008 10:20:02 -0800
To: "Michael Lang(Jr.)" <michaelallenlang@gmail.com>
CC: Peter Ansell <ansell.peter@gmail.com>, John Graybeal <graybeal@mbari.org>, Michael F Uschold <uschold@gmail.com>, semantic-web@w3.org, aldo gangemi <aldo.gangemi@gmail.com>, Peter Mika <pmika@yahoo-inc.com>, Ora Lassila <ora.lassila@nokia.com>, "Dr Jeff Z. Pan" <jeff.z.pan@abdn.ac.uk>, Tim Berners-Lee <timbl@csail.mit.edu>, Frank van Harmelen <Frank.van.Harmelen@cs.vu.nl>, sean bechhofer <sean.bechhofer@manchester.ac.uk>, michaelalang@gmail.com
Message-ID: <49109252.5090004@reinvent.com>
I strongly disagree that versioning will not be important. I suspect
that it will become the most profound and challenging problem to tackle
if we are to scale the application of semantic technology. Change
management is less critical in the short term for those concerned with
the linguistic notion of semantics. However, if you are concerned with
leveraging semantic models to drive/support high-value, mission-critical
systems, change management becomes a serious concern. Versioning and
change management become a show stopper if you go even further and
intend to create full computational semantic systems where the
algorithms and data/object models of software systems are replaced by
semantic models. In each of these three areas the level of trust in,
and dependency on, the asserted semantics will become critical.

Here are a few examples:

1. Trusting semantic models or ontologies to support operational or
mission-critical systems such as:
    a. Equipment, system maintenance applications
          - a knowledge modeler/ontologist asserts that a General
Electric A877623 is a subclass of Turbo Prop Engine and then in a
later version realizes the mistake: it is a subclass of another
system. The difference affects the scheduling of maintenance for aircraft.
          - a similar model asserts that a system should be overhauled
if a certain condition occurs
    b. Operational policies and compliance applications
          - a knowledge modeler asserts that a person who approves a
credit rating cannot approve a loan, but in a later version of the
compliance ontology realizes that the semantics need to be far more
sophisticated. The difference affects the ability of the compliance
system to prevent or permit fraud.
    c. Medical / Bio applications
          - A biomedical ontologist asserts that one protein
up-regulates a gene. Another subject matter expert asserts that the same
protein down-regulates that gene. A third researcher realizes that it is
important to tear down the model and express the context of the scenario
to capture the conflict. The difference affects what a medical
diagnostic system is able to conclude.
    d. Intelligence systems
          - A model of a social/economic network of terrorists needs
to be refined so that it does not create millions of false positives.
    e. Any other system that dreams of integrating vast amounts of
subject matter expertise and organizing it into something more
sophisticated and operational than just a categorization system,
dictionary or primitive taxonomy.

2. Simple ontologies/semantic models, but with massive adoption
    a. In one popular social networking ontology the class Person is 
used by millions of people. Later it becomes critical to redefine the 
class as a subclass of Social Contact in order to differentiate from the 
animal or physical notion of Person in another widely used ontology.

3. In the longer-term vision, semantic technology driving model-driven /
ontology-driven software systems
    a. Declarative, rich semantic models that explicitly describe the 
behaviour of parts or every aspect of a functional software system.
    b. Models that explicitly express the compatibility semantics 
between one software system and another so that software systems 
actually understand their purpose and functionality.
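
To make the maintenance example in 1a concrete, here is a minimal
sketch of how two versions of a model might be compared so that a
dependent system can see which subclass assertions changed. The class
names and the helpers diff_versions and reparented are hypothetical
and only illustrate the idea; "OtherSystem" stands in for the unnamed
"another system" above, and a real ontology would need far more than a
flat list of subclass statements.

```python
from typing import NamedTuple


class SubclassOf(NamedTuple):
    """One asserted 'child is a subclass of parent' statement."""
    child: str
    parent: str


def diff_versions(old, new):
    """Return (removed, added) assertions between two model versions."""
    old_set, new_set = set(old), set(new)
    return sorted(old_set - new_set), sorted(new_set - old_set)


def reparented(old, new):
    """Map each class whose direct parents changed to (before, after)."""
    def parents(assertions):
        table = {}
        for a in assertions:
            table.setdefault(a.child, set()).add(a.parent)
        return table

    before, after = parents(old), parents(new)
    return {
        child: (sorted(before[child]), sorted(after[child]))
        for child in sorted(before.keys() & after.keys())
        if before[child] != after[child]
    }


# Hypothetical identifiers, assumed for illustration only.
v1 = [SubclassOf("GE_A877623", "TurboPropEngine"),
      SubclassOf("TurboPropEngine", "Engine")]
v2 = [SubclassOf("GE_A877623", "OtherSystem"),
      SubclassOf("TurboPropEngine", "Engine")]

removed, added = diff_versions(v1, v2)
changed = reparented(v1, v2)
print("removed:", removed)
print("added:", added)
print("reparented:", changed)
```

A maintenance scheduler that depends on the model could refuse to
adopt the second version until someone has reviewed every entry in the
reparented map, which is the kind of change management argued for here.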

Systems that are more concerned with the NLP or the linguistic notion of
"semantics" are currently a little bit more resilient to change
management because their applications tend to use statistics or
approximation to create value. Example applications would be sense
disambiguation for advertising, entity extraction, etc. For these
systems machine learning can help us cope with a lot of inconsistencies
in semantic models. However, as these systems become more mission
critical, the rationalization and harmonization of semantics between
various ontologies will start to become a serious economic issue. Using
the right version of various semantic models (such as WordNet, DBpedia,
etc.) will become a very challenging and painful problem. This latter
area is a significant concern and area of effort/management right now.
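
As a rough illustration of "using the right version", an application
could declare the versions of the models it was built and tested
against and check them before doing any work. The sketch below assumes
a hypothetical check_pins helper and made-up version labels; it does
not reflect how any particular WordNet or DBpedia distribution
actually identifies its releases.

```python
def check_pins(pinned, loaded):
    """Compare required model versions with the versions actually loaded.

    Both arguments map a model name to a version label. Returns a list
    of human-readable problems; an empty list means every pin matches.
    """
    problems = []
    for name, want in sorted(pinned.items()):
        have = loaded.get(name)
        if have is None:
            problems.append(f"{name}: required {want}, not loaded")
        elif have != want:
            problems.append(f"{name}: required {want}, loaded {have}")
    return problems


# Hypothetical version labels, assumed for illustration only.
pinned = {"wordnet": "release-A", "dbpedia": "release-B"}
loaded = {"wordnet": "release-A", "dbpedia": "release-C"}

for problem in check_pins(pinned, loaded):
    print(problem)
```

Exact-match pinning is the bluntest possible policy. Deciding when two
different versions are compatible enough to substitute for one another
is the hard part, and it is not addressed by this sketch.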

The power of semantics can permit us to formally express and share the
semantics of things explicitly or implicitly. This can ultimately help
us actually get a grip on the ugly world of change management. However,
in the short term it will open a Pandora's box of power and change
management problems.

Conor

Conor Shankey
CTO
Reinvent, Inc - Vancouver.com
www.Reinvent.com
www.Vancouver.com

Michael Lang(Jr.) wrote:
> Peter,
>
> I agree 100% with your assessment.  In the semantic web world, I 
> believe that versioning will not be very important.  I think a major 
> benefit of using semantic web technologies is that you can build an 
> application that will adapt to changes in the semantics of a word as 
> the semantics change in the real world.  
>
> But, as you said, there may be cases where, at a significant point in 
> time, a community would like to version its vocabulary.  The goal of 
> this discussion is simply to develop some guidelines for versioning, 
> when it is necessary, that will make the transition from a past 
> version of a vocabulary to a new one as easy, accurate, and flexible 
> as possible for the users of a vocabulary. 
>
> Mike Lang
>
> On Tue, Nov 4, 2008 at 1:41 AM, Peter Ansell <ansell.peter@gmail.com 
> <mailto:ansell.peter@gmail.com>> wrote:
>
>
>     ----- "John Graybeal" <graybeal@mbari.org
>     <mailto:graybeal@mbari.org>> wrote:
>
>     > On Nov 3, 2008, at 10:48 AM, Michael Lang(Jr.) wrote:
>     >
>     > >  I strongly believe (and it seems that you and John agree) that
>     > > if a UID for a concept changes, the old version must have some
>     > > way of pointing to the new version.
>     >
>     > Funny, I would have said this the other way around (new points
>     > back to old, then the system services can provide the old -> new
>     > capability -- or is this what you are saying too?).  I have this
>     > notion that *any* change to a static resource's specifications --
>     > definition, metadata, semantics -- makes a new resource (this lets
>     > me compare resource_new to resource_old and see the difference
>     > between them unambiguously).
>     >
>     > With this vision, the resource can't change once it is created,
>     > even to point to a new resource (you see the problem).  Is this
>     > vision just plain wrong, per the consensus?
>
>     Should we really focus on a "ya just never know, do ya" philosophy
>     that hurts the majority of casual users more than it helps the
>     specialised users? If you make up a system where you require that
>     people manually migrate all their past statements in order to use
>     the system in a month's time then you won't be looked upon too
>     favourably. And if you give them the choice to mass migrate their
>     statements then what is the point if they always select "migrate
>     all to most current versions"?
>
>     This is a very radical discussion that I don't think fits the
>     majority of use cases that the semantic web will be applied to, as
>     it is decidedly anti Web-2.0 where there is a constant evolution
>     and links are relative, not static as in Web-1.0. If you really
>     face it, meaning migrates, and the particular structure at a given
>     instant in time isn't as relevant as the improvement in meaning
>     anyway. If rules in the semantic web are completely reliant on
>     data structures and unable to recognise the overall meaning that
>     people gradually migrate towards then they are always going to be
>     brittle, whether people are perfectly pedantic about UID's and/or
>     URI's or whether they end up referencing everything with relative
>     addresses which don't focus on particular representations at
>     particular points in time.
>
>     It isn't bad to version information at significant points in time,
>     but the archaic once-published-always-published-never-modified
>     culture doesn't fit with electronic technologies IMO.
>
>     (Just a few thoughts :) )
>
>     Cheers,
>
>     Peter
>
>
>
>
> -- 
> Revelytix, Inc.
>
> phone: 410-584-0009 (office)
>           443-928-3782 (cell)
> skype: michael.allen.lang.jr
> aim: MikeJrRevelytix
Received on Wednesday, 5 November 2008 09:13:01 UTC