RE: Benchmarking with Named Graphs

From: Orri Erling <erling@xs4all.nl>
Date: Wed, 30 Nov 2011 18:26:12 +0100
To: "'Marcus Cobden'" <lists@marcuscobden.co.uk>, <public-lod@w3.org>
Message-ID: <015c01ccaf85$28b7b120$7a271360$@xs4all.nl>

The Berlin SPARQL Benchmark (BSBM) generator, I think, can make many
graphs, split by type of entity.
All the billion triple challenge data sets consist of a ton of graphs with
10-1000 triples per graph.

So to benchmark with many graphs, the Billion Triple Challenge sets are
best. They also contain every aberration and abuse of diverse vocabularies
and syntax, which is good for their intended purpose.

Aside from the case where the graph marks provenance, there are not very
many use cases with a lot of graphs.  Web crawls, where one makes a graph
per page, are an exception.  Even in these cases, the more selective key is
still the s or the o and not the g, so for query optimization the large
number of graphs does not make a big difference.  Having a lot of different
values for g will cause quads to take more space, since g will no longer
compress away.  Aside from this, little difference is expected.
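
To make the selectivity point concrete, here is a minimal sketch that counts
distinct values per quad position and the average number of quads sharing one
value.  The toy data, the prefixes and the function names (position_stats,
triples_per_graph) are all made up for illustration and do not come from any
benchmark or store.  A lower average means a more selective key.

```python
from collections import Counter

# Toy quads as (s, p, o, g) tuples; one graph per crawled page.
# All identifiers here are hypothetical illustration data.
quads = [
    ("ex:alice", "foaf:name",  '"Alice"',  "g:page1"),
    ("ex:alice", "foaf:knows", "ex:bob",   "g:page1"),
    ("ex:bob",   "foaf:name",  '"Bob"',    "g:page1"),
    ("ex:bob",   "foaf:knows", "ex:carol", "g:page2"),
    ("ex:carol", "foaf:name",  '"Carol"',  "g:page2"),
    ("ex:dave",  "foaf:name",  '"Dave"',   "g:page2"),
]


def position_stats(quads):
    """For each of s, p, o, g: distinct values and average quads per value."""
    quads = list(quads)
    stats = {}
    for idx, name in enumerate("spog"):
        distinct = len({q[idx] for q in quads})
        # Fewer quads per value means the position is a more selective key.
        avg = len(quads) / distinct if distinct else 0.0
        stats[name] = {"distinct": distinct, "avg_quads_per_value": avg}
    return stats


def triples_per_graph(quads):
    """Count how many triples each graph holds."""
    return Counter(q[3] for q in quads)


stats = position_stats(quads)
for name, row in stats.items():
    print(name, row)
print(triples_per_graph(quads))
```

On this toy data a given s or o matches fewer quads on average than a given
g does, which is the situation described above.  Real crawl data would need
to be measured the same way before drawing any conclusion.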


-----Original Message-----
From: Marcus Cobden [mailto:lists@marcuscobden.co.uk] 
Sent: Wednesday, November 30, 2011 2:34 PM
To: public-lod@w3.org
Subject: Benchmarking with Named Graphs

Does anyone know of any multi-graph benchmarking datasets?

So rather than a dataset being just one big bag of triples to test with,
it is split into multiple named graphs.

Marcus Cobden
Received on Wednesday, 30 November 2011 17:27:23 UTC
