Berlin SPARQL Benchmark V2 - Results for Sesame, Virtuoso, Jena TDB, D2R Server, and MySQL from Chris Bizer on 2008-09-17 (public-sparql-dev@w3.org from July to September 2008)

From: Chris Bizer <chris@bizer.de>
Date: Wed, 17 Sep 2008 16:53:00 +0200
To: <semantic-web@w3.org>, <public-sparql-dev@w3.org>, <public-lod@w3.org>
Cc: "Andreas Schultz" <aschultz@mi.fu-berlin.de>
Message-ID: <0309D91EA705470297039D27EB95DEE2@wrz03715>

Hi all,

Over the last few weeks, we have extended the Berlin SPARQL Benchmark
(BSBM) to a multi-client scenario, fine-tuned the benchmark dataset
and the query mix, and implemented a SQL version of the benchmark in
order to be able to compare SPARQL stores with classical SQL stores.

Today, we have released the results of running the BSBM Benchmark
Version 2 against:

+ three RDF stores (Virtuoso Version 5.0.8, Sesame Version 2.2, Jena
TDB Version 0.53) and
+ two relational database-to-RDF wrappers (D2R Server Version 0.4 and
Virtuoso - RDF Views Version 5.0.8)

for datasets ranging from 250,000 triples to 100,000,000 triples.

To put the SPARQL query performance into context, we also
report the results of running the SQL version of the benchmark against
two relational database management systems (MySQL 5.1.26 and
Virtuoso - RDBMS Version 5.0.8).

A comparison of the performance for a single client working against
the stores can be found here:

http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/index.html#comparison

A comparison of the performance for 1 to 16 clients simultaneously
executing query mixes against the stores can be found here:

http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/index.html#multiResults

The complete benchmark results, including the setup of the experiment
and the configuration of the different stores, can be found here:

http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/index.html

The current specification of the Berlin SPARQL Benchmark can be found
here:

http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/20080912/

It is interesting to see:

1. that relational database-to-RDF wrappers generally outperform RDF
stores for larger dataset sizes.
2. that no store outperforms the others for all queries and dataset
sizes.
3. that the query throughput still varies widely within the
multi-client scenario.
4. that the fastest RDF store is still 7 times slower than a
relational database.

Thanks a lot to

+ Eli Lilly and Company and especially Susie Stephens for making this
work possible through a research grant.
+ Orri Erling, Andy Seaborne, Arjohn Kampman, Michael Schmidt, Richard
Cyganiak, Ivan Mikhailov, Patrick van Kleef, and Christian Becker for
their feedback on the benchmark design and their help with configuring
the stores and running the benchmark experiment.

Without all your help it would not have been possible to conduct this
experiment.

We very much welcome feedback on the benchmark design and the results of
the experiment.

Cheers,

Chris Bizer and Andreas Schultz

--
Prof. Dr. Chris Bizer
Freie Universität Berlin
Phone: +49 30 838 55509
Mail: chris@bizer.de
Web: www.bizer.de

Received on Wednesday, 17 September 2008 14:54:17 UTC