Re: Berlin SPARQL Benchmark V2 - Results for Sesame, Virtuoso, Jena TDB, D2R Server, and MySQL from Eyal Oren on 2008-09-22 (semantic-web@w3.org from September 2008)

From: Eyal Oren <eyal@cs.vu.nl>
Date: Mon, 22 Sep 2008 10:47:19 +0200
To: semantic-web@w3.org, "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <20080922084719.GA22468@localhost>

On 09/19/08 23:12 +0200, Orri Erling wrote:
>>Has there been any analysis on whether there is a *fundamental* 
>>reason for such performance difference? Or is it simply a question of 
>>"maturity"; in other words, relational db technology has been around for 
>>a very long time and is very mature, whereas RDF implementations are 
>>still quite recent, so this gap will surely narrow ...?
>This is a very complex subject.  I will offer some analysis below, but 
>this I fear will only raise further questions.  This is not the end of the 
>road, far from it.
As far as I understand, another issue is relevant here. The benchmark is 
somewhat unfair because the relational stores have one advantage over the 
native triple stores: the relational data structure is fixed (Products, 
Producers, Reviews, etc., with given columns), while the triple 
representation is generic (arbitrary s,p,o).
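
To make the difference concrete, here is a minimal sketch in Python using 
an in-memory SQLite database. The table names, column names, and sample 
values are made up for illustration and are not the actual benchmark 
schema, and storing triples in a single three-column table is only one 
possible layout, not how any particular triple store works internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Fixed relational layout (hypothetical schema): one row per product,
# one column per attribute.
cur.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, label TEXT, producer TEXT)"
)
cur.execute("INSERT INTO product VALUES (1, 'Widget', 'Acme')")

# Generic triple layout: every attribute value is its own (s, p, o) row.
cur.execute("CREATE TABLE triple (s TEXT, p TEXT, o TEXT)")
cur.executemany(
    "INSERT INTO triple VALUES (?, ?, ?)",
    [
        ("product1", "label", "Widget"),
        ("product1", "producer", "Acme"),
    ],
)

# Relational: a single-row lookup returns both attributes at once.
relational = cur.execute(
    "SELECT label, producer FROM product WHERE id = 1"
).fetchall()

# Triples: each additional attribute costs one more self-join.
triples = cur.execute(
    """
    SELECT t1.o, t2.o
    FROM triple AS t1
    JOIN triple AS t2 ON t1.s = t2.s
    WHERE t1.s = 'product1' AND t1.p = 'label' AND t2.p = 'producer'
    """
).fetchall()

print(relational)
print(triples)
```

Both queries return the same answer, but the relational one relies on the 
schema knowing in advance which attributes belong together.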

One can question whether such flexibility is relevant in practice, and if 
so, one may try to extract such structured patterns from data on-the-fly. 
Still, it's important to note that we're comparing somewhat different 
things here between the relational and the triple representation of the 
benchmark. 
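
As a rough illustration of what extracting structure from the data could 
look like, the sketch below groups subjects by the set of predicates they 
use, so that subjects sharing the same predicates form a candidate "table". 
This is only an assumed, simplistic approach with made-up sample data and a 
hypothetical helper name (infer_tables); it is not something any of the 
benchmarked stores is known to do.

```python
from collections import defaultdict

# Made-up sample triples (s, p, o), for illustration only.
triples = [
    ("product1", "label", "Widget"),
    ("product1", "producer", "Acme"),
    ("product2", "label", "Gadget"),
    ("product2", "producer", "Acme"),
    ("review1", "rating", "4"),
    ("review1", "reviewFor", "product1"),
]


def infer_tables(triples):
    """Group subjects by the exact set of predicates they use."""
    predicates_by_subject = defaultdict(set)
    for s, p, _ in triples:
        predicates_by_subject[s].add(p)

    # Subjects with an identical predicate set share one candidate table.
    tables = defaultdict(list)
    for s, predicates in predicates_by_subject.items():
        tables[frozenset(predicates)].append(s)
    return dict(tables)


tables = infer_tables(triples)
for columns, subjects in tables.items():
    print(sorted(columns), subjects)
```

Real data is rarely this regular, so an exact match on predicate sets would 
likely need to be relaxed in practice.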

  -eyal

PS: the benchmark is great, really, possible improvements notwithstanding.

Received on Tuesday, 23 September 2008 08:22:47 UTC