- From: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
- Date: Mon, 10 Nov 2008 15:12:57 +0100
- To: public-xg-rdb2rdf@w3.org
Dear All,

I recently reviewed the StateOfTheArt Wiki Page and found it puzzling at times. I already changed some parts:

* Reorganized the summary of the literature survey. There were lots of entries in categories named "Other". I moved R2O, Sahoo et al. and Dartgrid to the Domain-Semantics section. R2O and ODEMapster are similar to D2RQ, which is also a mapping language. Both can be used manually to model domain semantics, R2O even more so, since it requires a pre-existing domain ontology. Dartgrid is similar to Hu et al., as it provides a visual alignment tool. The target of Sahoo et al. is answering questions with the help of SPARQL, but the technique used is ETL (correct me if I'm mistaken), so I also moved it to Domain-Semantics. The work of Chebotko is IMHO completely out of scope, as he is concerned with SPARQL-to-SQL rewriting for triple stores, where the data is already in RDF. I think the reference can be removed completely.

* Removed the table criterion "Query Implementation", as it is misleading. It can be merged with "Mapping Implementation". Some entries were of the form "static" (ETL) and had "SPARQL" as "Query Implementation". Once ETL is performed, the result can naturally be loaded into a triple store and queried with SPARQL; likewise, an on-demand, query-driven approach can easily produce an RDF dump. The main criterion here should be whether the data is retrieved on the fly from the database or just transformed once.

The "Data Integration" criterion in the table doesn't really distinguish much, since all approaches certainly aim at integrating data (into the Semantic Web). More important criteria would be whether approaches

1. need a pre-existing ontology,
2. go beyond database semantics in the direction of domain semantics, or
3. are used in real projects that successfully integrate more than one database.

Proposal for classification of literature:

There seem to be 4 classes into which the literature can be divided:

1. Schema/Ontology Alignment: Hu et al., Dartgrid.
   Both try to create an alignment from a DB schema to an existing ontology. Related work in this direction is very numerous, to mention just Coma++ [1].

2. Database Mining: Li, DB2OWL, RDBToOnto, Tirmizi.
   All start from the existing database and try to extract as much information as possible from the database schema. They also stop there, which means they do not use any external sources such as existing domain ontologies.

3. Integration/Domain Semantics: Sahoo et al.
   Mainly concerned with modeling domain semantics correctly.

4. Languages/Servers: D2RQ, R2O, RDF Views, Asio Tools.
   All have their own language and all provide means to model domain semantics, but most often manually.

I hope I could help. I can also offer to restructure the StateOfTheArt Wiki page myself, but I didn't dare to make such changes on my own account.

Regards,
Sebastian Hellmann

[1] http://dbs.uni-leipzig.de/de/Research/coma.html

--
http://bis.informatik.uni-leipzig.de/SebastianHellmann
Received on Tuesday, 11 November 2008 15:26:12 UTC