- From: Chris Bizer <chris@bizer.de>
- Date: Tue, 22 Feb 2011 16:02:23 +0100
- To: <public-sparql-dev@w3.org>, "'Semantic Web'" <semantic-web@w3.org>, "'public-lod'" <public-lod@w3.org>
- Message-ID: <034001cbd2a1$8271e120$8755a360$@bizer.de>
Hi all,

We are happy to announce Version 3 of the Berlin SPARQL Benchmark, as well as the results of a benchmark experiment in which we compared the query, load and update performance of Virtuoso, Jena TDB, 4store, BigData, and BigOWLIM using the new benchmark.

The Berlin SPARQL Benchmark Version 3 (BSBM V3) defines three different query mixes that test different capabilities of RDF stores:

1. The Explore query mix tests query performance with simple SPARQL 1.0 queries.
2. The Explore-and-Update query mix tests read and write performance using SPARQL 1.0 SELECT queries as well as SPARQL 1.1 Update queries.
3. The Business Intelligence query mix consists of complex SPARQL 1.1 queries that rely on aggregation as well as subqueries, each of which touches large parts of the test dataset.

The BSBM V3 specification is found at
http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/20101129/

We also conducted a benchmark experiment in which we compared the query, load and update performance of Virtuoso, Jena TDB, 4store, BigData, and BigOWLIM using the new benchmark. We tested the stores with 100 million triple and 200 million triple data sets and ran the Explore as well as the Explore-and-Update query mixes.

The results of this experiment are found at
http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/V6/index.html

It is interesting to see that:

1. Virtuoso dominates the Explore use case for multiple clients.
2. BigOWLIM also shows good multi-client scaling behavior for the 100M dataset.
3. 4store is the fastest store for the Explore-and-Update query mix.
4. BigOWLIM is able to load the 200M dataset in under 40 minutes, which comes near the bulk load times of relational databases like MySQL.
5. All stores that we have previously tested with BSBM V2 improved their query performance and load times.

We also tried to run the Business Intelligence query mix against the stores.
BigData and 4store currently do not provide all SPARQL features that are required to run the BI query mix. We thus tried to run the Business Intelligence query mix only against Virtuoso, TDB and BigOWLIM. Doing this, we ran into several "technical problems" that prevented us from finishing the tests and from reporting meaningful results. We thus decided to give the store vendors more time to fix and optimize their stores and will run the BI query mix experiment again in about four months (July 2011).

Thanks a lot to Orri Erling for proposing the Business Intelligence use case and for the initial queries for the query mix. Lots of thanks also go to Ivan Mikhailov for his in-depth review of the Business Intelligence query mix and for finding several bugs in the queries. We also want to thank Peter Boncz and Hugh Williams for feedback on the new version of the BSBM benchmark.

We want to thank the store vendors and implementers for helping us to set up and configure their stores for the experiment. Lots of thanks to Andy Seaborne, Ivan Mikhailov, Hugh Williams, Zdravko Tashev, Atanas Kiryakov, Barry Bishop, Bryan Thompson, Mike Personick and Steve Harris.

The work on the BSBM Benchmark Version 3 is funded by the LOD2 - Creating Knowledge out of Linked Data project (http://lod2.eu/).

More information about the Berlin SPARQL Benchmark is found at
http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/

Cheers,

Andreas Schultz and Chris Bizer

--
Prof. Dr. Christian Bizer
Web-based Systems Group
Freie Universität Berlin
+49 30 838 55509
http://www.bizer.de
chris@bizer.de
Received on Tuesday, 22 February 2011 15:02:07 UTC