Re: Follow up on our conference call on 7/11...

Ezzat, Ahmed wrote:
> I am not up to speed on what Virtuoso does, i.e., I do not know whether what Virtuoso does will work in my scenario.
>
> But a data warehouse in our environment is 100+ TB, which would be considered one data source in the enterprise. Do you see converting that much data into RDF (i.e., as described in my first approach) as viable?
>   
It can be converted; that is a data center matter if warehousing is the
ultimate solution. But I wouldn't take the warehousing route if I can
create RDF Views of the SQL Data :-)  Our RDB to RDF mapping is all
about using SQL optimization heuristics to deliver high-performance and
scalable RDF Views of SQL Data.

I am confident that, with an appropriately configured data center plus
Virtuoso Cluster Edition, using either RDF Views or RDF warehousing,
your challenge is addressable. In our tests with the TPC-H benchmark,
we've been able to get RDF Views to outperform RDF warehousing, so
warehousing is purely a last-resort option at best.
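
To make the difference between the two approaches concrete, here is a
minimal sketch in Python. It is not Virtuoso's implementation or API: the
`orders` table, the `http://example.org/` URIs, and every function name are
hypothetical, and SQLite stands in for the SQL data source. The warehousing
function converts every row to triples up front, while the view function
translates a single triple pattern into SQL so that the relational engine
does the filtering and triples are only built on demand.

```python
import sqlite3

# Hypothetical URI scheme, for illustration only.
BASE = "http://example.org/"


def make_db():
    """Create a small in-memory SQL source standing in for the warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "acme", 120.0), (2, "globex", 75.5), (3, "acme", 300.0)],
    )
    return conn


def row_to_triples(row):
    """Map one relational row to (subject, predicate, object) triples."""
    oid, customer, total = row
    subject = f"{BASE}order/{oid}"
    return [
        (subject, f"{BASE}customer", customer),
        (subject, f"{BASE}total", total),
    ]


def warehouse(conn):
    """Warehousing: convert every row to triples up front and keep them."""
    store = []
    for row in conn.execute("SELECT id, customer, total FROM orders ORDER BY id"):
        store.extend(row_to_triples(row))
    return store


def view_query(conn, customer):
    """View: push the pattern (?order :customer <customer>) down into SQL,
    then build triples only for the rows that match."""
    rows = conn.execute(
        "SELECT id, customer, total FROM orders WHERE customer = ? ORDER BY id",
        (customer,),
    )
    return [triple for row in rows for triple in row_to_triples(row)]


conn = make_db()
materialized = warehouse(conn)       # triples for every row in the table
on_demand = view_query(conn, "acme")  # triples for matching rows only
```

In the view case the SQL data stays where it is and nothing has to be kept
in sync with a second store; the cost moves into query translation, which is
where the SQL optimizer comes in.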

Kingsley
> Ahmed
>
> -----Original Message-----
> From: Kingsley Idehen [mailto:kidehen@openlinksw.com]
> Sent: Wednesday, July 16, 2008 7:16 PM
> To: Ezzat, Ahmed
> Cc: public-xg-rdb2rdf@w3.org
> Subject: Re: Follow up on our conference call on 7/11...
>
> Ezzat, Ahmed wrote:
>   
>> Hello,
>> This is a question that I would be interested in hearing your reaction
>> and views about.
>> In an environment with multiple data sources, some of them huge
>> like data warehouses, it seems like transforming all data sources into
>> RDF and then querying that RDF store using SPARQL is going to put an
>> unreasonable amount of pressure on the RDF store. In addition, all
>> changes in these data sources need to be reflected in the RDF store as
>> soon as possible. In the above paragraph I am ignoring the notion of
>> local and domain Ontologies.
>> An alternative I am exploring is to decompose the user query into a set
>> of subquery (SQL and Search) operations against the relevant data
>> sources (i.e., context) -> transform the results into RDF using local
>> Ontologies, then resolve differences using the domain ontology -> apply
>> the SPARQL query on the union of the RDF graphs after reconciliation.
>> Even though this approach is far better from an RDF storage point of
>> view (i.e., scalability), it seems like response time can be less than
>> desirable?
>> Comments and thoughts including additional alternatives...
>>     
> Ezzat,
>
> All I can say without additional detail is that you shouldn't jump to
> conclusions about the scalability of RDF engines re. the warehousing
> approach or the sophistication of SQL optimizers when injected into the
> SQL-RDF mapping realm.
>
> Virtuoso offers solutions for the RDF warehousing and RDF Views
> approaches. I am certainly happy to be proven wrong via experimentation
> re. Virtuoso's ability to handle either approach without compromising
> performance or scalability.
>
> Virtuoso has been designed and engineered to handle heavy duty RDF data
> management (physical or virtual) from the get go.
>
> Please provide me with additional details about database counts,
> sizes, etc.
>
>
> Kingsley
>
>   
>> Regards,
>> Ahmed
>> Ahmed K. Ezzat, Ph.D.
>> HP Fellow, Business Intelligence Software Division
>> Hewlett-Packard Corporation
>> 19333 Vallco Parkway, MS 4502, Cupertino, CA 95014-2599
>> Office: Email: Ahmed.Ezzat@hp.com
>> Tel: 408-285-6022 Fax: 408-285-1430
>> Personal: Email: AhmedEzzat@aol.com
>> Tel: 408-253-5062 Fax: 408-253-6271
>>
>> ------------------------------------------------------------------------
>>
>>     
>
>
> --
>
>
> Regards,
>
> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
> President & CEO
> OpenLink Software     Web: http://www.openlinksw.com
>
>
>
>
>
>   


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com

Received on Thursday, 17 July 2008 04:50:52 UTC