
Re: Requirements for Relational to RDF mapping document comment...

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Tue, 06 Jan 2009 11:12:29 -0500
Message-ID: <496382ED.2040209@openlinksw.com>
To: ashok.malhotra@oracle.com
CC: Li L Ma <malli@cn.ibm.com>, "Ezzat, Ahmed" <Ahmed.Ezzat@hp.com>, "public-xg-rdb2rdf@w3.org" <public-xg-rdb2rdf@w3.org>, public-xg-rdb2rdf-request@w3.org

On 1/6/09 10:10 AM, ashok malhotra wrote:
> Hello Li Ma:
> You said...
> *> Besides integration, I think, another significant value of RDB2RDF
> > mapping for enterprise data management is to enable ontology and rule
> > reasoning for analytics services.
> I agree. It would be good to add a usecase that enables rules and reasoning.
> Could you amplify your discussion below with an example of a rule or
> some sort of reasoning that is enabled by the mapping to the Semantic
> Web? We can then add this as a third usecase.

Rules & Reasoning are areas that are easily misunderstood when it comes
to RDF-related matters. Most of the time the examples are
overcomplicated and ultimately detract from the point.

Here is a simple example of what I would encourage us to look at:

1. Produce a Data Source Ontology
2. Add subclasses and sub-properties to the ontology (maybe even add
class level equivalence statements)
3. Generate Inference Rules (in Virtuoso you would simply be associating
the ontology above with a Name Rule)
4. Show how SPARQL or HTML browsing of the instance data generated for
the ontology is enhanced by the Inference rules

The scenario above can be applied to all the demonstration databases
that accompany all the major RDBMS engines.
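
To make steps 1-4 concrete, here is a minimal sketch in plain Python. It is
not Virtuoso's API or any product's rule engine; it only forward-chains the
subclass and sub-property statements over a tiny set of triples. Every name
in it (ex:Manager, ex:Employee, ex:manages, infer, instances_of) is invented
for illustration.

```python
# Hypothetical sketch: all ex: names and helper functions are made up here.
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"

# Steps 1-2: a small data source ontology with subclasses and a sub-property.
ontology = {
    ("ex:Manager", SUBCLASS, "ex:Employee"),
    ("ex:Employee", SUBCLASS, "ex:Person"),
    ("ex:manages", SUBPROP, "ex:worksWith"),
}

# Instance data as it might be generated from relational rows.
instance_data = {
    ("ex:alice", RDF_TYPE, "ex:Manager"),
    ("ex:bob", RDF_TYPE, "ex:Employee"),
    ("ex:alice", "ex:manages", "ex:bob"),
}


def infer(triples):
    """Step 3: apply subclass and sub-property rules until nothing new appears."""
    closed = set(triples)
    while True:
        new = set()
        for s, p, o in closed:
            if p == RDF_TYPE:
                # An instance of a class is also an instance of its superclasses.
                new |= {(s, RDF_TYPE, sup)
                        for c, q, sup in closed if q == SUBCLASS and c == o}
            # A statement using a property also holds for its super-properties.
            new |= {(s, sup, o)
                    for sub, q, sup in closed if q == SUBPROP and sub == p}
        if new <= closed:
            return closed
        closed |= new


def instances_of(triples, cls):
    """Step 4: a stand-in for a query such as SELECT ?s WHERE { ?s a cls }."""
    return sorted(s for s, p, o in triples if p == RDF_TYPE and o == cls)


raw = ontology | instance_data
inferred = infer(raw)

print(instances_of(raw, "ex:Person"))       # no explicit ex:Person rows
print(instances_of(inferred, "ex:Person"))  # alice and bob, via the subclass chain
```

Without the rules, a query for ex:Person finds nothing because no row was
ever typed as a person; with them, the same query returns both individuals.
That difference is the enhancement step 4 is meant to show.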

Note: you can perform analytics against any of these demonstration
databases to convey the point with clarity and simplicity.

> Thanks!
> *
> All the best, Ashok
> Li L Ma wrote:
>> Hi Ezzat and all,
>> Happy New Year!
>> I agree with your comments on RDF for MDM integration. So far, I also
>> have not seen an effective use of RDF for master data integration.
>> Here, I'd like to share our research work on using linked data
>> techniques for master data management. The following picture shows our
>> high-level ideas. We created a core ontology from an MDM logical model,
>> as well as a mapping (defined by D2RQ) between the created ontology
>> and MDM data stored in relational databases. That means we can have an
>> RDF view of existing master data. Furthermore, the published master
>> data could be linked/mapped to domain ontologies by rules. Once master
>> data is mapped to ontologies, users can define their own business
>> rules using classes and properties defined in the core MDM and domain
>> ontologies and issue SPARQL queries that include the defined rules
>> against MDM databases. Our SeDA engine, which takes as input a SPARQL
>> query, a D2RQ mapping, ontologies and user-defined business rules, can
>> translate a SPARQL query to a single SQL query to retrieve master
>> data. In summary, using some linked data technologies (mainly mapping
>> and reasoning), we provided advanced analytics services over
>> centralized master data, but did NOT focus on the integration problem
>> in MDM. An interesting problem to explore in the future is to use
>> linked data technologies for the registry-based MDM implementation.
>> *Besides integration, I think, another significant value of RDB2RDF
>> mapping for enterprise data management is to enable ontology and rule
>> reasoning for analytics services. So, I think reasoning is also
>> important for both ETL and on demand mapping approaches.*
>> For "the Union Bomb", I think Orri proposed it from an implementation
>> perspective. Compared with the ETL approach, on-demand mapping has higher
>> requirements for performance and scalability. If a mapping can provide
>> clues for query optimization, SPARQL-to-SQL engines can generate more
>> efficient SQL statements.
>> Best Regards,
>> Li MA, Ph.D
>> Manager, Semantic Technologies
>> IBM China Research Lab
>> TEL: 86-10-58748078
>> T/L: 11905 ext. 8078
>> FAX: 86-10-58748731
>> E-Mail: MaLLi@cn.ibm.com
>> Homepage: http://www.research.ibm.com/people/m/mali
>> *"Ezzat, Ahmed" <Ahmed.Ezzat@hp.com>*
>> Sent by: public-xg-rdb2rdf-request@w3.org
>> 2009-01-03 11:31
>> To
>> 	"public-xg-rdb2rdf@w3.org" <public-xg-rdb2rdf@w3.org>
>> cc
>> Subject
>> 	Requirements for Relational to RDF mapping document comment...
>> Hello,
>> Good paper. Below are a couple of comments:
>> 1. _Motivation_:
>>>> RDF offers a systematic ontology and query language which can be
>> used against the mapped data, without concern of the semantic
>> heterogeneity inherent in independently arisen relational databases.
>> AKE> What about all the issues in mapping an RDBMS schema to the
>> application/domain ontology, as well as reconciling the different
>> local ontologies? Even with only RDBMS data sources, having a unified,
>> consistent and complete single data view (MDM) is a lot of work. RDF
>> does not help with any of these issues. The author might want to
>> rewrite/soften this sentence.
>> 2. _Relative Desirability of Mapping and ETL_:
>>>> We expect cases favoring ETL to be characterized by:
>>     * Large number of heterogeneous sources of data
>>     * Complex application logic needed for transforming the data
>>     * RDF reasoning being performed on the mapped data
>>     * Queries with variables in class or predicate positions
>> AKE> I agree with some of these but not others, and there are missing
>> bullets. For example, I am not sure why ETL (the dump approach) is
>> preferred with a large number of data sources. I suspect a main factor
>> is scalability, i.e., if the overall aggregate data size from the large
>> number of data sources is hard to accommodate in a single RDF store, it
>> might force you to query the data sources rather than translate all of
>> them into an RDF store and then query it. A *second* factor is the
>> dynamic nature of the data sources; with highly dynamic content I
>> suspect querying the data sources is better.
>> 3. Mapping on Demand:
>>>> The Union Bomb:
>> AKE> I am not sure why this is a problem under "Mapping on Demand". It
>> seems to be the same issue for both ETL and on-demand mapping.
>> P.S. Most EIS (CRM like Vignette or ERP like SAP) servers do not
>> expose an SQL interface. I assume you meant conceptually in the context
>> of multiple databases.
>> 4. Criteria of Success:
>>>> At the end... There should exist at least two interoperable
>> implementations of the mapping language providing at least ETL. Aside
>> from this, implementers are encouraged to support on-demand mapping.
>> AKE> No, we need to support both approaches as first-class citizens. I
>> view on-demand as more critical, since having minimal cost for
>> translation matters more than the ETL approach. Second, with either
>> ETL or on-demand we need to have a proof of concept or recommendation
>> for how to reconcile RDF sub-graphs from multiple data sources into a
>> single domain ontology.
>> Regards,
>> Ahmed
>> Ahmed K. Ezzat, Ph.D.
>> HP Fellow, Business Intelligence Software Division
>> Hewlett-Packard Corporation
>> 11000 Wolf Road, Bldg 42 Upper, MS 4502, Cupertino, CA 95014-0691
>> Office: Email: Ahmed.Ezzat@hp.com
>> Tel: 408-447-6380 Fax: 1408796-5427 Cell: 408-504-2603
>> Personal: Email: AhmedEzzat@aol.com
>> Tel: 408-253-5062 Fax: 408-253-6271



Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com
Received on Tuesday, 6 January 2009 16:13:16 UTC
