Integration demonstrator? (was Re: RDF and SPARQL derivative of THALIA)

From: Danny Ayers <danny.ayers@gmail.com>
Date: Sat, 31 Mar 2007 10:29:51 +0200
Message-ID: <1f2ed5cd0703310129x315e343dkb7df6acf912960ea@mail.gmail.com>
To: "W3C SWEO IG" <public-sweo-ig@w3.org>
Cc: kidehen@openlinksw.com, "Chris Bizer" <chris@bizer.de>, "Orri Erling" <oerling@openlinksw.com>, "Sören Auer" <auer@seas.upenn.edu>, "Richard Cyganiak" <richard@cyganiak.de>, "Stefano Mazzocchi" <stefanom@mit.edu>, "Huajun Chen" <huajunsir@gmail.com>

Hi Kingsley, passing this on to SWEO for comment, with a little
proposal for action -

[[
Please take a look at:

http://www.cise.ufl.edu/project/thalia.html

I think we should consider collectively producing an RDF and SPARQL
derivative of THALIA.

The end result would be a great showcase for the data integration
prowess of RDF and the Semantic Web oriented tools that address these
integration challenges.
]]

This is timely given Stefano's recent post regarding the problem of
integration in the global environment:

http://www.betaversion.org/~stefano/linotype/news/101/

(bah, currently down - maybe Slashdot having a field day...)

The fact that independently developed models may not naturally align
when expressed as RDF/OWL is a core problem of the Semantic Web, and a
potential target for criticism. It may even be an Achilles heel (I'm
not convinced ;-). I seem to remember the general question cropping up
a lot on the Standard Upper Ontology list, and it's certainly been
hanging in the RDF air at least since Ian's "Crisis" post:

http://iandavis.com/blog/2005/09/crisis

I believe this could be something SWEO might productively address. By
tackling it head on, not only would criticism be averted but (as
Kingsley suggests) showcase material generated. Given the current
workload I'd suggest this be provisionally put in the queue with low
priority at this point in time. For the same reason, I'd position it
closer to a Community Project (with most of the actual work being done
outside SWEO) than a Task Force.

A possible approach might be along the following lines, documenting every stage:

1. identify realistically diverse source data (the THALIA dataset
might do, I'm not sure)
2. identify a suitable methodology for deriving a domain ontology
3. create ontology
4. find and apply tools for expressing 1. in 2.
5. record issues encountered
6. overcome or work around issues
7. publish results
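
To make the "express the source data in the ontology" and "record
issues" steps a bit more concrete, here is a minimal sketch. Everything
in it is an assumption for illustration only: the two sources, their
field names, the example.org namespace and the to_triples helper are
all made up, and a real attempt would use proper RDF tooling rather
than plain tuples.

```python
# Hypothetical sketch: two independently designed sources describe the
# same kind of record with different field names. A per-source mapping
# expresses both in one shared vocabulary; anything the mapping cannot
# handle is recorded as an issue instead of being silently dropped.

EX = "http://example.org/ontology#"  # made-up namespace, not a real ontology

# Invented sample records (not taken from any real dataset).
SOURCE_A = [{"id": "a1", "Title": "Databases", "Lecturer": "Smith"}]
SOURCE_B = [{"id": "b7", "course_name": "Databases", "taught_by": "Smith"}]

# Field name -> shared property, one mapping per source.
MAPPINGS = {
    "A": {"Title": EX + "title", "Lecturer": EX + "instructor"},
    "B": {"course_name": EX + "title", "taught_by": EX + "instructor"},
}


def to_triples(source_name, records):
    """Return (triples, unmapped) for one source.

    triples  -- list of (subject, property, value) tuples
    unmapped -- list of (source_name, field) pairs with no mapping
    """
    mapping = MAPPINGS[source_name]
    triples, unmapped = [], []
    for record in records:
        subject = f"http://example.org/{source_name}/{record['id']}"
        for field, value in record.items():
            if field == "id":
                continue
            prop = mapping.get(field)
            if prop is None:
                unmapped.append((source_name, field))  # an issue to record
                continue
            triples.append((subject, prop, value))
    return triples, unmapped


if __name__ == "__main__":
    for name, records in (("A", SOURCE_A), ("B", SOURCE_B)):
        triples, unmapped = to_triples(name, records)
        for triple in triples:
            print(triple)
        for issue in unmapped:
            print("unmapped:", issue)
```

The hard part is of course deciding what goes in the mappings (and
whether a simple field-to-property mapping is enough at all), which is
exactly what the documented issues would need to capture.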

As a precursor I'd suggest a good search of the literature, and
communication with SWBPD. Much of the work may already have been done
(I've seen plenty of papers on ontology mapping and alignment, but
can't remember anything which covers this full workflow in a pragmatic
fashion).

A first pass at this would mean an arbitrary choice of tools and
techniques. However, with one set of results in place, it could serve
as a challenge for other researchers and tool developers.

Given the (offlist) responses to Kingsley's mail it seems likely that
there is already interest in working on the problem. The task/project
would need a champion - Kingsley, if he's interested, would be the
ideal person given the substantial prior work on integration at
OpenLink.

This isn't far away from the Linking Open Data project, so (assuming
that team agree) I'd suggest using that mailing list for discussion,
along with appropriate pages on the ESW Wiki.

Thoughts?

Cheers,
Danny.


-- 

http://dannyayers.com
Received on Saturday, 31 March 2007 08:30:06 UTC
