Re: Integration demonstrator? (was Re: RDF and SPARQL derivative of THALIA)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Sat, 31 Mar 2007 12:06:14 -0400
Message-ID: <460E86F6.7050505@openlinksw.com>
CC: W3C SWEO IG <public-sweo-ig@w3.org>, Chris Bizer <chris@bizer.de>, Sören Auer <auer@seas.upenn.edu>, Richard Cyganiak <richard@cyganiak.de>, Stefano Mazzocchi <stefanom@mit.edu>, Huajun Chen <huajunsir@gmail.com>, Frederick Giasson <fred@fgiasson.com>

Danny Ayers wrote:
>
> Hi Kingsley, passing this onto SWEO for comment, with a little
> proposal for action -
Okay.
>
> [[
> Please take a look at:
>
> http://www.cise.ufl.edu/project/thalia.html
>
> I think we should consider collectively producing an RDF and SPARQL
> derivative of THALIA.
>
> The end result would be a great showcase for the data integration
> prowess of RDF and the Semantic Web oriented tools that address these
> integration challenges.
> ]]
>
> This is timely given Stefano's recent post regarding the problem of
> integration in the global environment:
>
> http://www.betaversion.org/~stefano/linotype/news/101/
>
> (bah, currently down - maybe Slashdot having a field day...)
>
> The fact that independently developed models may not naturally align
> when expressed as RDF/OWL is a core problem of the Semantic Web, and a
> potential target for criticism. It may even be an Achilles Heel (I'm
> not convinced ;-). I seem to remember the general question cropping up
> a lot on the Standard Upper Ontology list, and it's certainly been hanging
> in the RDF air at least since Ian's "Crisis" post:
>
> http://iandavis.com/blog/2005/09/crisis
>
> I believe this could be something SWEO might productively address. By
> tackling it head on, not only would criticism be averted but (as
> Kingsley suggests) showcase material generated. Given the current
> workload I'd suggest this be provisionally put in the queue with low
> priority at this point in time. For the same reason, I'd position it
> closer to a Community Project (with most of the actual work being done
> outside SWEO) than a Task Force.
>
> A possible approach might be along the following lines, documenting 
> every stage:
>
> 1. identify realistically diverse source data (the THALIA dataset
> might do, I'm not sure)
> 2. identify a suitable methodology for deriving a domain ontology
> 3. create ontology
> 4. find and apply tools for expressing 1. in 2.
> 5. record issues encountered
> 6. overcome or work around issues
> 7. publish results
>
> As a precursor I'd suggest a good search of the literature, and
> communication with SWBPD. Much of the work may already have been done
> (I've seen plenty of papers on ontology mapping and alignment, but
> can't remember anything which covers this full workflow in a pragmatic
> fashion).
>
> A first pass at this would mean an arbitrary choice of tools and
> techniques, however with one set of results in place it could serve as
> a challenge for other researchers and tool developers.
>
> Given the (offlist) responses to Kingsley's mail it seems likely that
> there is already interest in working on the problem. The task/project
> would need a champion - Kingsley, if he's interested, would be the
> ideal person given the substantial prior work on integration at
> OpenLink.
I have no choice re. this matter :-)
Data Integration issues simply won't leave me alone :-)
>
> This isn't far away from the Linking Open Data project, so (assuming
> that team agree) I'd suggest using that mailing list for discussion,
> along with appropriate pages on the ESW Wiki.
My aim is to move this effort into the Linking Open Data project. This
is an integral part of what this effort is about. It is also a nice
segue to the Enterprise Community, where the Data Integration challenges
are best understood. The use of RDF to alleviate Data Integration
challenges is pretty much the essence of the Semantic Web's value
proposition to the Enterprise community.

The THALIA project is XML and XQuery oriented at the current time. We
need to make it RDF and SPARQL oriented by way of enhancement. Once this
is achieved, we will have an objective foundation for clarifying what to
use when, and under what circumstances.

BTW - I have also reached out to the XBRL community re. a collaborative
effort to produce OWL Ontologies from XBRL taxonomies [1] (currently
expressed in XML Schema, e.g. the General Ledger Taxonomy). As is the case
with THALIA, the responses from the key XBRL players [2] have also been
very positive. Thus, I've asked Frederick Giasson to pick this up in a
similar vein to the very successful Music Ontology project [3].

Links:
1. http://en.wikipedia.org/wiki/XBRL
2. http://conference.xbrl.org/proceedings/2005/explaining_the_gl_taxonomy
   (Eric Cohen, Global XBRL Technical Leader for PricewaterhouseCoopers)
3. http://pingthesemanticweb.com/ontology/mo/ (Music Ontology)


Kingsley
>
> Thoughts?
>
> Cheers,
> Danny.
>
>


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com
Received on Saturday, 31 March 2007 16:06:19 UTC
