Berlin SPARQL Benchmark V2 - Results for Sesame, Virtuoso, Jena TDB, D2R Server, and MySQL

Hi all,

Over the last few weeks, we have extended the Berlin SPARQL Benchmark 
(BSBM) to a multi-client scenario, fine-tuned the benchmark dataset 
and the query mix, and implemented a SQL version of the benchmark so 
that SPARQL stores can be compared with classical SQL stores.

Today, we have released the results of running BSBM Version 2 
against:

+ three RDF stores (Virtuoso Version 5.0.8, Sesame Version 2.2, Jena 
TDB Version 0.53) and
+ two relational database-to-RDF wrappers (D2R Server Version 0.4 and 
Virtuoso - RDF Views Version 5.0.8)

for datasets ranging from 250,000 triples to 100,000,000 triples.

To put the SPARQL query performance into context, we also report the 
results of running the SQL version of the benchmark against two 
relational database management systems (MySQL 5.1.26 and 
Virtuoso - RDBMS Version 5.0.8).

A comparison of the performance for a single client working against 
the stores is found here:

http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/index.html#comparison

A comparison of the performance for 1 to 16 clients simultaneously 
executing query mixes against the stores is found here:

http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/index.html#multiResults

The complete benchmark results, including the setup of the experiment 
and the configuration of the different stores, can be found here:

http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/index.html

The current specification of the Berlin SPARQL Benchmark is found 
here:

http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/20080912/

It is interesting to see:

1. that relational database-to-RDF wrappers generally outperform RDF 
stores for larger dataset sizes.
2. that no store outperforms the others for all queries and dataset 
sizes.
3. that the query throughput still varies widely within the 
multi-client scenario.
4. that the fastest RDF store is still 7 times slower than a 
relational database.

Thanks a lot to

+ Eli Lilly and Company and especially Susie Stephens for making this 
work possible through a research grant.
+ Orri Erling, Andy Seaborne, Arjohn Kampman, Michael Schmidt, Richard 
Cyganiak, Ivan Mikhailov, Patrick van Kleef, and Christian Becker for 
their feedback on the benchmark design and their help with configuring 
the stores and running the benchmark experiment.

Without all your help it would not have been possible to conduct this 
experiment.

We very much welcome feedback on the benchmark design and the results 
of the experiment.

Cheers,

Chris Bizer and Andreas Schultz

--
Prof. Dr. Chris Bizer
Freie Universität Berlin
Phone: +49 30 838 55509
Mail: chris@bizer.de
Web: www.bizer.de 

Received on Wednesday, 17 September 2008 14:54:18 UTC