- From: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
- Date: Thu, 20 Nov 2008 17:59:52 +0100
- To: "public-xg-rdb2rdf@w3.org" <public-xg-rdb2rdf@w3.org>
Hi all,

some comments on RIF, the Initial XG Recommendation draft and D2RQ Lessons Learned [1] (sections 3.1 "People need weird mappings" and 3.2 "Mapping to RDF is not enough. People want data integration").

I wondered if it would make sense to redefine the term "mapping language" so that there are two different languages: a "from/selection" language, which is concerned with selecting parts of the RDB, and a "to/target" language, which is concerned with assembling the RDF and thus would be the actual "mapping language". The idea I have in mind is to use a language native to the relational data store, such as SQL or Datalog, for the selection, and then to start from there with the mapping language to model the RDF. Triplify, of course, uses this approach, but it has some limits regarding the expressivity of the output.

The whole mapping process could look like this: an SQL query yields a list of rows (or a list of named columns), as does "mydb:Customers( ?ID ?Name ?Phone ?Address)" from the RIF example (actually, the SQL query could be substituted for this line). The resulting columns are bound to the variables and can then be transformed by the standardized mapping language (the "to" language), such as RIF: "External( pred:iri-string( ?T External( func:concat( "tel:" ?Phone )))"

So if the selection is decoupled from the mapping, some things become quite easy:

1. Integration: two or more databases can be integrated by first designing the mapping and then implementing a different SQL query for each database (assuming the databases differ).

2. Reuse of mappings: as the mappings are now geared more towards the creation of RDF, the same mappings can and should be used for different RDBs. So, for example, Drupal, Wordpress and Typo3 would use the same mappings but different SQL queries to fill the variables (perhaps with some small adjustments because of encoding or other problems).

3. There could be input languages other than SQL, maybe XPath or similar: anything that fills the variables, or some other interface.
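To make the decoupling concrete, here is a minimal sketch in Python, not a proposal for the actual mapping language. Everything in it is an assumption for illustration: the two SQLite schemas, the example.org base IRI, the choice of foaf:name and foaf:phone as target properties, and the helper names to_triples and run_mapping. Each database gets its own SQL query (the "from" step), both queries bind the same variables ID, Name and Phone, and a single shared function (the "to" step) assembles the RDF, including the "tel:" concatenation from the RIF example.

```python
import sqlite3

# Hypothetical target vocabulary and base IRI, chosen only for this sketch.
BASE = "http://example.org/customer/"
FOAF = "http://xmlns.com/foaf/0.1/"


def to_triples(row):
    """Shared "to" step: turn one binding of ID/Name/Phone into N-Triples lines."""
    subject = f"<{BASE}{row['ID']}>"
    # Escape backslashes and quotes so the name is a valid N-Triples literal.
    name = row["Name"].replace("\\", "\\\\").replace('"', '\\"')
    phone_iri = "tel:" + row["Phone"]  # mirrors func:concat("tel:" ?Phone)
    return [
        f'{subject} <{FOAF}name> "{name}" .',
        f"{subject} <{FOAF}phone> <{phone_iri}> .",
    ]


def run_mapping(conn, select_sql):
    """Run a database-specific "from" query and feed each row to the shared mapping."""
    conn.row_factory = sqlite3.Row  # lets the mapping address columns by name
    triples = []
    for row in conn.execute(select_sql):
        triples.extend(to_triples(row))
    return triples


# Database A: the columns already match the variable names.
db_a = sqlite3.connect(":memory:")
db_a.executescript("""
    CREATE TABLE Customers (ID INTEGER, Name TEXT, Phone TEXT);
    INSERT INTO Customers VALUES (1, 'Ada Lovelace', '+4934112345');
""")
SELECT_A = "SELECT ID, Name, Phone FROM Customers"

# Database B: a different schema, reshaped by SQL into the same variables.
db_b = sqlite3.connect(":memory:")
db_b.executescript("""
    CREATE TABLE users (uid INTEGER, first TEXT, last TEXT,
                        tel_country TEXT, tel_number TEXT);
    INSERT INTO users VALUES (1, 'Ada', 'Lovelace', '+49', '34112345');
""")
SELECT_B = """
    SELECT uid AS ID,
           first || ' ' || last AS Name,
           tel_country || tel_number AS Phone
    FROM users
"""

triples_a = run_mapping(db_a, SELECT_A)
triples_b = run_mapping(db_b, SELECT_B)

for line in triples_a:
    print(line)
```

In this sketch only the SELECT statements differ between the two databases, which is the integration and reuse argument of points 1 and 2. Any other source that can produce the same named variables could replace the SQL step, as in point 3.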
Regarding the XG Recommendation draft, the points "complete when compared to the relational algebra" and "must expose vendor specific SQL features" could then be ticked off. As I'm not familiar with RIF, I'm not sure whether something like this can be incorporated.

I recently talked to Michael Martin, who transformed a relational database for a web application [2] (using ETL). As he needed to model some complex domain semantics, and (in line with D2RQ [1], 3.1 "People need weird mappings") really needed some weird mappings, he used SQL and PHP as a mapping language, which is quite a powerful combination. This also allowed him to correct encodings and filter out some strange values.

From my point of view, producing a "clean" and good schema from an RDB requires SQL plus a programming language (PHP/Java) as the selection language, whose results are then handed to the mapping language/processor as variables or via an interface. (I admit this last part is not easily realized and maybe goes too far. But as many databases that have evolved over time are a mess, it would be nice to have options rather than workarounds for the mapping language.)

Regards,
Sebastian Hellmann

[1] http://www.w3.org/2007/03/RdfRDB/papers/d2rq-positionpaper/
[2] http://www.ceur-ws.org/Vol-301/Poster_5_Martin.pdf

--
http://aksw.org/SebastianHellmann
Received on Thursday, 20 November 2008 17:00:33 UTC