Requirements for Relational to RDF mapping document comment...

From: Ezzat, Ahmed <Ahmed.Ezzat@hp.com>
Date: Sat, 3 Jan 2009 03:31:39 +0000
To: "public-xg-rdb2rdf@w3.org" <public-xg-rdb2rdf@w3.org>
Message-ID: <3B7AE9BA67C72B4891EF21842246A21C4049841E9C@GVW1097EXB.americas.hpqcorp.net>

Hello,

Good paper.  Below are a couple of comments:

1.      Motivation:
>> RDF offers a systematic ontology and query language which can be used against the mapped data, without concern of the semantic heterogeneity inherent in independently arisen relational databases.

AKE> What about the issues involved in mapping an RDBMS schema to the application/domain ontology, and in reconciling the different local ontologies? Even with only RDBMS data sources, building a single unified, consistent, and complete data view (MDM) is a lot of work. RDF does not help with any of these issues. The author might want to rewrite or soften this sentence.

2.      Relative Desirability of Mapping and ETL:
>> We expect cases favoring ETL to be characterized by:
*       Large number of heterogeneous sources of data
*       Complex application logic needed for transforming the data
*       RDF reasoning being performed on the mapped data
*       Queries with variables in class or predicate positions

AKE> I agree with some of these and not with others, and some bullets are missing. For example, I am not sure why ETL (the dump approach) is preferred with a large number of data sources. I suspect a main factor is scalability: if the aggregate data size across the many data sources is hard to accommodate in a single RDF store, you might be forced to query the data sources directly rather than translating all of them into an RDF store and then querying it. A second factor is the dynamic nature of the data sources; with highly dynamic content, I suspect querying the data sources is better.
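
The trade-off between the two approaches can be sketched in a few lines. This is an illustration only, not taken from the paper under review: the `emp` table, the `http://example.org/` namespace, and every function name are hypothetical, and Python's built-in `sqlite3` module stands in for a real RDBMS.

```python
import sqlite3

EX = "http://example.org/"  # hypothetical namespace, assumed for illustration


def make_source():
    """Create a toy relational source (assumption: a single 'emp' table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO emp VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
    return conn


def row_to_triples(row):
    """Map one row to RDF-style (subject, predicate, object) tuples."""
    emp_id, name = row
    return [(f"{EX}emp/{emp_id}", f"{EX}name", name)]


def etl_dump(conn):
    """ETL: translate every row once and keep the triples in a separate store."""
    store = []
    for row in conn.execute("SELECT id, name FROM emp ORDER BY id"):
        store.extend(row_to_triples(row))
    return store


def on_demand_names(conn):
    """On demand: translate rows at request time, so results track the source."""
    triples = []
    for row in conn.execute("SELECT id, name FROM emp ORDER BY id"):
        triples.extend(row_to_triples(row))
    return triples


conn = make_source()
snapshot = etl_dump(conn)                          # taken before the update
conn.execute("INSERT INTO emp VALUES (3, 'Cy')")   # the source changes afterwards
live = on_demand_names(conn)                       # sees the new row
```

Here `snapshot` holds a copy of all the data and goes stale once the source changes, while `live` stores nothing and reflects the source at query time. These are the size and freshness concerns raised above.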

3.      Mapping on Demand:
>> The Union Bomb:

AKE> I am not sure why this is a problem specific to "Mapping on Demand". It seems to be the same issue for both ETL and on-demand mapping.
P.S. Most EIS servers (CRM like Vignette, or ERP like SAP) do not expose a SQL interface. I assume you meant this conceptually, in the context of multiple databases.

4.      Criteria of Success:
>> At the end.... There should exist at least two interoperable implementations of the mapping language providing at least ETL.  Aside from this, implementers are encouraged to support on-demand mapping.

AKE> No, we need to support both approaches as first-class citizens. I view on-demand mapping as the more critical of the two, since keeping the cost of translation minimal matters more. Second, with either ETL or on-demand mapping, we need a proof of concept or a recommendation for how to reconcile RDF sub-graphs from multiple data sources into a single domain ontology.
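
To make the reconciliation requirement concrete, here is a minimal sketch of one possible step: rewriting source-local predicates to a shared domain vocabulary before merging. It is an assumption for illustration, not a proposal from the paper. All URIs and names below are hypothetical, and the sketch deliberately ignores the harder problem of reconciling subject identity across sources.

```python
# Hypothetical mapping from source-local predicates to one domain ontology.
LOCAL_TO_DOMAIN = {
    "http://crm.example/custName": "http://domain.example/name",
    "http://erp.example/partner_name": "http://domain.example/name",
}


def reconcile(*subgraphs):
    """Merge sub-graphs, rewriting known local predicates to domain terms.

    Predicates with no entry in LOCAL_TO_DOMAIN are passed through unchanged.
    A set is used so that triples that become identical collapse into one.
    """
    merged = set()
    for graph in subgraphs:
        for subject, predicate, obj in graph:
            merged.add((subject, LOCAL_TO_DOMAIN.get(predicate, predicate), obj))
    return merged


# Two toy sub-graphs (assumed to already share subject URIs).
crm = [("http://domain.example/c/1", "http://crm.example/custName", "Acme")]
erp = [
    ("http://domain.example/c/1", "http://erp.example/partner_name", "Acme"),
    ("http://domain.example/c/1", "http://erp.example/credit", "AA"),
]

merged = reconcile(crm, erp)
```

The two name triples collapse into a single domain-level triple, while the unmapped `credit` predicate stays source-specific. A real recommendation would also have to say how such leftover terms and conflicting values are handled.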
Regards,

Ahmed


Ahmed K. Ezzat, Ph.D.
HP Fellow, Business Intelligence Software Division
Hewlett-Packard Corporation
11000 Wolf Road, Bldg 42 Upper, MS 4502, Cupertino, CA 95014-0691
Office: Email: Ahmed.Ezzat@hp.com Tel: 408-447-6380 Fax: 1408796-5427 Cell: 408-504-2603
Personal: Email: AhmedEzzat@aol.com Tel: 408-253-5062 Fax: 408-253-6271
Received on Saturday, 3 January 2009 03:33:32 GMT
