Re: Follow up on our conference call on 7/11... from ashok malhotra on 2008-07-17 (public-xg-rdb2rdf@w3.org from July 2008)

From: ashok malhotra <ashok.malhotra@oracle.com>
Date: Thu, 17 Jul 2008 09:35:10 -0700
To: "Ezzat, Ahmed" <Ahmed.Ezzat@hp.com>
CC: "public-xg-rdb2rdf@w3.org" <public-xg-rdb2rdf@w3.org>
Message-ID: <487F74BE.40308@oracle.com>

Hello Ahmed:
Thank you for starting this thread.

In my view, there are situations where you want to translate the data
to RDF, store it, and then query it. But if the data is very large
and/or changes frequently, a better approach is to leave the data in the
native database and create a virtual RDF representation of it, which I
call a semantic cover. The semantic cover can then be queried with
SPARQL, and the SPARQL queries translated into queries over the native
databases.
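
As a rough illustration of the semantic cover idea, here is a minimal
sketch in Python. It is not a SPARQL engine: it rewrites a single triple
pattern into SQL and builds the triples on the fly, so nothing is stored
as RDF. The "employee" table, its columns, the example.org URIs, and the
function names are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical namespace and mapping: predicate URI -> column of "employee".
BASE = "http://example.org/"
PREDICATE_TO_COLUMN = {
    BASE + "name": "name",
    BASE + "dept": "dept",
}

def pattern_to_sql(predicate, obj=None):
    """Rewrite the triple pattern (?s, predicate, obj) into SQL.

    obj=None means the object position is a variable.
    Returns (sql, params).
    """
    # Column names come from the fixed mapping above, never from user input.
    column = PREDICATE_TO_COLUMN[predicate]  # KeyError if predicate is unmapped
    sql = f"SELECT id, {column} FROM employee"
    params = ()
    if obj is not None:
        sql += f" WHERE {column} = ?"
        params = (obj,)
    return sql + " ORDER BY id", params

def query_cover(conn, predicate, obj=None):
    """Answer the pattern from the native table; no RDF is materialized."""
    sql, params = pattern_to_sql(predicate, obj)
    return [
        (f"{BASE}employee/{row_id}", predicate, value)
        for row_id, value in conn.execute(sql, params)
    ]

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)"
    )
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, "Ann", "R&D"), (2, "Bob", "Sales")],
    )
    # Roughly: SELECT ?s WHERE { ?s <http://example.org/dept> "Sales" }
    return query_cover(conn, BASE + "dept", "Sales")
```

Because the rows are read at query time, changes in the native database
are visible immediately, which is the point of keeping the RDF view
virtual.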

It should also be possible to enrich the semantic cover with additional
semantics, but exactly how this would be done needs to be worked out.

A recommendation that our XG may want to make to the W3C is to start
work on a language that would map relational data to RDF. The mapping
could be used to translate the data to RDF and store it in an RDF
database, or it could be used to create a virtual mapping as discussed
above.

We have heard a number of presentations on quick default mappings of
relational data to RDF. But we also need the ability to customize these
mappings and add additional semantics.
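
To make the distinction concrete, here is a small sketch in Python of a
default mapping derived from the schema, followed by two customizations.
It is only one possible shape for such a mapping, not a proposal for the
language itself. The "person" table, the example.org namespace, and the
function names are assumptions for the example; the FOAF name property
is the only real vocabulary term used.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace

def default_mapping(conn, table):
    """Derive a default mapping: one predicate per column, named after it."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    return {col: f"{BASE}{table}#{col}" for col in columns}

def to_triples(conn, table, key, mapping):
    """Materialize the table as (subject, predicate, object) triples."""
    cols = list(mapping)
    cursor = conn.execute(
        f"SELECT {key}, {', '.join(cols)} FROM {table} ORDER BY {key}"
    )
    triples = []
    for row in cursor:
        subject = f"{BASE}{table}/{row[0]}"
        for col, value in zip(cols, row[1:]):
            if value is not None:  # a NULL cell produces no triple
                triples.append((subject, mapping[col], value))
    return triples

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.execute("INSERT INTO person VALUES (1, 'Ann', NULL)")
    mapping = default_mapping(conn, "person")
    # Customization 1: drop the key column, it is already in the subject URI.
    del mapping["id"]
    # Customization 2: reuse an existing vocabulary term instead of the default.
    mapping["name"] = "http://xmlns.com/foaf/0.1/name"
    return to_triples(conn, "person", "id", mapping)
```

The same mapping dictionary could drive either a one-time export into an
RDF store or a virtual view; only the consumer of the mapping changes.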

This approach starts with the relational database schema. An alternative
approach may be to create an ontology first and then create
(distributed) SQL queries to answer questions about the ontology.

Ahmed, does that cover what you had in mind?

All, please respond to this note so we can start coming to a shared 
understanding as to what we should
recommend to the W3C.

All the best, Ashok

Ezzat, Ahmed wrote:
> Hello,
> I would be interested in hearing your reactions to, and views on, the
> following question.
> In an environment with multiple data sources, some of which are huge
> (such as data warehouses), it seems that transforming all data sources
> into RDF and then querying that RDF store using SPARQL is going to put
> an unreasonable load on the RDF store. In addition, all changes in
> these data sources need to be reflected in the RDF store as soon as
> possible. In the above paragraph, I am ignoring the notion of local and
> domain Ontologies.
> An alternative I am exploring is to decompose the user query into a set
> of subquery (SQL and Search) operations on the relevant data sources
> (i.e., context) -> transform the results into RDF using local
> Ontologies, then resolve differences using the domain ontology -> apply
> the SPARQL query on the union of the RDF graphs after reconciliation.
> Even though this approach is far better from an RDF storage point of
> view (i.e., scalability), it seems like response time can be less than
> desirable?
> Comments and thoughts including additional alternatives…
> Regards,
> Ahmed
> Ahmed K. Ezzat, Ph.D.
> HP Fellow, Business Intelligence Software Division
> Hewlett-Packard Corporation
> 19333 Vallco Parkway, MS 4502, Cupertino, CA 95014-2599
> Office: Email: Ahmed.Ezzat@hp.com
> Tel: 408-285-6022 Fax: 408-285-1430
> Personal: Email: AhmedEzzat@aol.com
> Tel: 408-253-5062 Fax: 408-253-6271

Received on Thursday, 17 July 2008 16:37:50 UTC