
literature, tools, tasks

From: Sebastian Hellmann <kurzum@googlemail.com>
Date: Sun, 09 Mar 2008 17:52:27 +0100
Message-ID: <47D415CB.1010609@googlemail.com>
To: public-xg-rdb2rdf@w3.org

My name is Sebastian Hellmann. I'm about to finish my Master's degree
and will soon be a PhD student at the AKSW Group Leipzig [1]. I'm
currently doing some research in the db2owl direction. I spent the last
couple of days compiling a large list of literature on mapping and
conversion, which could be of some use [2][Y][Z].
IMHO the standardization process can be divided into separate tasks,
which clearly depend on how the mapping/conversion will be used:

1. uni-directional mapping RDB2RDF/OWL
One of the sentences I came across most often during my research was:
"There is a lot of information 'hidden' in relational databases." It
seemed that it would often be sufficient to reveal that data and pour
it into the Semantic Web. This should be done with a very lightweight
approach, so that the average MySQL admin can participate. If Jemima
Kiss [3] is right and Web 3.0, from a non-technical perspective, is
all about rank, recommendation and personalization, then it should be
possible to connect EVERY Web account a person has with his other Web
accounts. Ignoring problems such as trust/security/privacy on this
issue, there should be a way for a user to connect his identity on a
phpBB forum [4] with his identity on answers.yahoo.com [5] and with his
identity on a friend's self-knitted PHP-MySQL Web app. Although the
charter of this group states that there should not be a default
mapping, I actually think there should be at least a recommendation or
best practice on how to map the basics, e.g. a table "user" to FOAF, or
at least a selection of standard vocabularies [FOAF, SIOC] and Linked
Data [DBpedia, product catalog, revyu] for this simple scenario. There
are already some tools (possibly quite a few) that provide fast access
and querying, e.g. D2R [6], SquirrelRDF [7] and Sören Auer's
Triplify [8]. With proper standardization these tools could go one
simple step further, which would help the Semantic Web reach critical
mass.
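
To make the "table 'user' to FOAF" idea concrete, here is a minimal
sketch in Python. It is only an illustration under assumptions: the
column names (username, realname, homepage), the base URI
http://example.org/ and the choice of foaf:nick, foaf:name and
foaf:homepage are made up for this example and are not a proposed
default mapping. It uses an in-memory SQLite database instead of MySQL
so that it runs without any setup.

```python
import sqlite3

FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping: column of the "user" table -> FOAF property.
COLUMN_TO_FOAF = {
    "username": FOAF + "nick",
    "realname": FOAF + "name",
    "homepage": FOAF + "homepage",
}
# Columns whose values are URIs rather than plain literals.
URI_COLUMNS = {"homepage"}


def escape_literal(value):
    """Escape a string for use inside an N-Triples literal."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))


def user_table_to_ntriples(conn, base_uri):
    """Return N-Triples lines describing every row of the "user" table."""
    columns = ", ".join(COLUMN_TO_FOAF)
    lines = []
    for row in conn.execute(f'SELECT id, {columns} FROM "user" ORDER BY id'):
        subject = f"<{base_uri}user/{row[0]}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{FOAF}Person> .")
        for (column, prop), value in zip(COLUMN_TO_FOAF.items(), row[1:]):
            if value is None:
                continue  # a NULL column produces no triple
            if column in URI_COLUMNS:
                obj = f"<{value}>"
            else:
                obj = f'"{escape_literal(value)}"'
            lines.append(f"{subject} <{prop}> {obj} .")
    return lines


def demo_users():
    """Build a tiny invented "user" table and convert it."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "user" (id INTEGER PRIMARY KEY, '
                 "username TEXT, realname TEXT, homepage TEXT)")
    conn.executemany('INSERT INTO "user" VALUES (?, ?, ?, ?)', [
        (1, "alice", "Alice Example", None),
        (2, "bob", None, "http://example.org/~bob"),
    ])
    return user_table_to_ntriples(conn, "http://example.org/")


if __name__ == "__main__":
    print("\n".join(demo_users()))
```

The only site-specific part is the COLUMN_TO_FOAF dictionary, which is
roughly what a best-practice recommendation would have to pin down.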

2. merging/mapping one or several databases to one RDF/OWL Schema
Providing basic CRUD operations, keeping the old database(s).
Most of the problems and suggestions have been stated in an earlier post
by Andrew Matthews:
Tools I found for this: DartGrid [9], COMA++ [10]

3. converting a relational database to RDF/OWL, no RDB any more
It would be beneficial for some applications, e.g. those with a strong
need for structuring/reasoning, to migrate completely to RDF/OWL. So the
question here should be how information is stored in an RDB and how it
is stored in ontologies. The relational schema normally needs to be
transformed by an engineer, because OWL can express richer schemas, as
in reverse engineering to object-oriented databases. The main issue
here is probably to help create a good ontology that contains all the
well-defined information and leaves out weakly defined/unclean
information from the RDB.
There is a paper from Beijing [11] that proposes 11 rules, which seem
to make sense. (I could not find a link to the tool they write about; it
is called CODE. Please mail me if you find it.) At our group, we have
an ontology enrichment tool, Jens Lehmann's DL-Learner [12], which
could be used to enrich an initial schema extracted from an RDB to ease
the burden on an engineer.
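
As a rough illustration of what "an initial schema extracted from an
RDB" could look like, here is a minimal Python sketch. It does NOT
implement the 11 rules from [11]. It only assumes a naive heuristic
(table -> owl:Class, plain column -> owl:DatatypeProperty, foreign-key
column -> owl:ObjectProperty), and the example tables, the "ex:" prefix
and the property naming scheme are invented. It reads the schema from
SQLite so that it runs without setup.

```python
import sqlite3


def class_name(table):
    """Turn a table name such as "forum_post" into "ForumPost"."""
    return "".join(part.capitalize() for part in table.split("_"))


def schema_to_owl(conn, base_uri="http://example.org/schema#"):
    """Return Turtle lines for a naive OWL schema of all tables."""
    lines = [
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        f"@prefix ex: <{base_uri}> .",
    ]
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    for table in tables:
        cls = f"ex:{class_name(table)}"
        lines.append(f"{cls} a owl:Class .")
        # Map each foreign-key column to the table it references.
        fks = {fk[3]: fk[2] for fk in
               conn.execute(f'PRAGMA foreign_key_list("{table}")')}
        for _cid, column, _type, _notnull, _default, pk in \
                conn.execute(f'PRAGMA table_info("{table}")'):
            if pk:
                continue  # the primary key only identifies the individual
            prop = f"ex:{table}_{column}"
            if column in fks:
                target = f"ex:{class_name(fks[column])}"
                lines.append(f"{prop} a owl:ObjectProperty ; "
                             f"rdfs:domain {cls} ; rdfs:range {target} .")
            else:
                lines.append(f"{prop} a owl:DatatypeProperty ; "
                             f"rdfs:domain {cls} .")
    return lines


def build_demo_db():
    """Create two invented tables linked by a foreign key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, "
                 "author_id INTEGER REFERENCES author(id))")
    return conn


if __name__ == "__main__":
    print("\n".join(schema_to_owl(build_demo_db())))
```

The output is deliberately flat (no subclasses, no cardinalities), which
is exactly the kind of schema an engineer or an enrichment tool would
then have to refine.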

Basically I tried to give a quick overview of how I see the
problems/tasks we face. I'm more interested in the 1st and the 3rd
part, because we are working on these in one way or another. One could
argue, though, that this incubator group should be concerned with the
2nd part only: the 1st part might be too easy, and there will be a lot
of tutorials for it all over the net, which will reach a consensus
sooner or later all by themselves; and any company that wants to engage
in the 3rd task will need help from a professional ontology engineer
anyhow, making standardization unnecessary.

Hope I could help at least with my literature list,

[1] http://aksw.org/About
[2] http://bibsonomy.org/user/sebastian/dbowl
[3] http://www.guardian.co.uk/media/2008/feb/04/web20
[4] http://www.phpbb.com/
[5] http://answers.yahoo.com/
[6] http://www4.wiwiss.fu-berlin.de/bizer/D2RQ/
[7] http://jena.sourceforge.net/SquirrelRDF/
[8] http://triplify.org/About
[9] http://esw.w3.org/topic/DartGrid
[10] http://dbs.uni-leipzig.de/de/Research/coma.html
[11] http://dblp.uni-trier.de/db/conf/waim/waim2005.html#LiDW05
[12] http://aksw.org/Projects/DLLearner

**not so useful due to the sheer number of links:
delicious links about everything I found:
[Y] http://del.icio.us/kurzum/dbowl
delicious links about implemented tools:
[Z] http://del.icio.us/kurzum/dbowl%2Btools
Received on Monday, 10 March 2008 03:08:56 GMT
