
Re: Follow up on our conference call on 7/11...

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Thu, 17 Jul 2008 00:50:11 -0400
Message-ID: <487ECF83.9080500@openlinksw.com>
To: "Ezzat, Ahmed" <Ahmed.Ezzat@hp.com>
CC: "public-xg-rdb2rdf@w3.org" <public-xg-rdb2rdf@w3.org>

Ezzat, Ahmed wrote:
> I am not up to speed on what Virtuoso does, i.e., I do not know whether what Virtuoso does will work in my scenario.
> But a data warehouse in our environment is 100+ TB, which would be considered one data source in the enterprise. Do you see converting that size of data into RDF (i.e., as described in my first approach) as viable?
It can be converted; that becomes a data center matter if warehousing is 
the ultimate solution. But I wouldn't take the warehousing route if I can 
create RDF Views of the SQL data :-)  Our RDB-to-RDF mapping is all 
about using SQL optimization heuristics to deliver high-performance, 
scalable RDF Views of SQL data.

I am confident that, with an appropriately configured data center plus 
Virtuoso Cluster Edition, your challenge is addressable using either 
RDF Views or RDF warehousing. In our tests with the TPC-H benchmark, 
we've been able to get RDF Views to outperform RDF warehousing, so 
warehousing is a last-resort option at best.
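
To make the distinction between the two approaches concrete, here is a
minimal sketch in Python using the standard sqlite3 module. It is an
illustration only, not Virtuoso's mapping language or implementation; the
customer table, the example.org URIs, and all function names are
hypothetical. Warehousing copies every row into triples up front, while a
view translates a triple pattern into SQL at query time so the SQL engine
does the filtering.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace, for illustration only


def make_db():
    """Build a tiny in-memory SQL source (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, nation TEXT)"
    )
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?, ?)",
        [(1, "Alice", "FR"), (2, "Bob", "US"), (3, "Carol", "FR")],
    )
    return conn


def warehouse(conn):
    """Warehousing: materialize every row as triples before any query runs."""
    triples = []
    rows = conn.execute("SELECT id, name, nation FROM customer ORDER BY id")
    for cid, name, nation in rows:
        subject = f"{BASE}customer/{cid}"
        triples.append((subject, BASE + "name", name))
        triples.append((subject, BASE + "nation", nation))
    return triples


def view_names_by_nation(conn, nation):
    """View: answer the pattern { ?c ex:nation <nation> ; ex:name ?n }
    by rewriting it to SQL, so only matching rows ever become triples."""
    sql = "SELECT id, name FROM customer WHERE nation = ? ORDER BY id"
    return [
        (f"{BASE}customer/{cid}", BASE + "name", name)
        for cid, name in conn.execute(sql, (nation,))
    ]


conn = make_db()
print(len(warehouse(conn)), "triples materialized by warehousing")
print(view_names_by_nation(conn, "FR"))
```

With the view, the size of the RDF side tracks the size of the answer
rather than the size of the source, and changes in the SQL data are
visible immediately because nothing is copied.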

> Ahmed
> -----Original Message-----
> From: Kingsley Idehen [mailto:kidehen@openlinksw.com]
> Sent: Wednesday, July 16, 2008 7:16 PM
> To: Ezzat, Ahmed
> Cc: public-xg-rdb2rdf@w3.org
> Subject: Re: Follow up on our conference call on 7/11...
> Ezzat, Ahmed wrote:
>> Hello,
>> This is a question that I would be interested in hearing your reaction
>> and views about.
>> In a multiple data sources environment where some of them are huge
>> like data warehouses, it seems like transforming all data sources into
>> RDF then querying that RDF store using SPARQL is going to put too much
>> pressure on the RDF store beyond reasonable. In addition all changes
>> in these data sources need to be reflected in the RDF store as soon as
>> possible. In the above paragraph I am ignoring the notion of local and
>> domain Ontologies.
>> An alternative I am exploring is to decompose the user query into a set
>> of subquery operations (SQL and search) against the relevant data
>> sources (i.e., context), transform the results into RDF using local
>> ontologies, resolve differences using the domain ontology, and then apply
>> the SPARQL query on the union of the RDF graphs after reconciliation.
>> Even though this approach is far better from an RDF storage point of view
>> (i.e., scalability), it seems like response time can be less than desirable?
>> Comments and thoughts including additional alternatives...
> Ezzat,
> All I can say without additional detail is that we shouldn't jump to
> conclusions about the scalability of RDF engines re. the warehousing
> approach or the sophistication of SQL optimizers when injected into the
> SQL-RDF mapping realm.
> Virtuoso offers solutions for the RDF warehousing and RDF Views
> approaches. I am certainly happy to be proven wrong via experimentation
> re. Virtuoso's ability to handle either approach without compromising
> performance or scalability.
> Virtuoso has been designed and engineered to handle heavy duty RDF data
> management (physical or virtual) from the get go.
> Please provide me with additional details about database counts,
> sizes, etc.
> Kingsley
>> Regards,
>> Ahmed
>> Ahmed K. Ezzat, Ph.D.
>> HP Fellow, Business Intelligence Software Division
>> Hewlett-Packard Corporation
>> 19333 Vallco Parkway, MS 4502, Cupertino, CA 95014-2599
>> Office: Email: Ahmed.Ezzat@hp.com
>> Tel: 408-285-6022 Fax: 408-285-1430
>> Personal: Email: AhmedEzzat@aol.com
>> Tel: 408-253-5062 Fax: 408-253-6271
>> ------------------------------------------------------------------------
> --
> Regards,
> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
> President & CEO
> OpenLink Software     Web: http://www.openlinksw.com



Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com
Received on Thursday, 17 July 2008 04:50:52 UTC
