- From: Orri Erling <erling@xs4all.nl>
- Date: Wed, 30 Nov 2011 18:26:12 +0100
- To: "'Marcus Cobden'" <lists@marcuscobden.co.uk>, <public-lod@w3.org>
Hi,

The Berlin SPARQL Benchmark (BSBM) generator can, I think, produce many graphs, split by type of entity.

All the Billion Triple Challenge data sets consist of a large number of graphs with 10-1000 triples per graph. So for benchmarking with many graphs, the Billion Triple Challenge sets are the best choice. They also contain every aberration and abuse of diverse vocabularies and syntax, which is good for their intended purpose.

Aside from the case where the graph marks provenance, there are not very many use cases with a lot of graphs. For web crawls where one makes a graph per page, this is different. For these cases, the more selective key is still the s or the o and not the g, so for query optimization the large number of graphs does not make a big difference.

Having a lot of different values for g will cause quads to take more space, since g will no longer compress away. Aside from this, little difference is expected.

Orri

-----Original Message-----
From: Marcus Cobden [mailto:lists@marcuscobden.co.uk]
Sent: Wednesday, November 30, 2011 2:34 PM
To: public-lod@w3.org
Subject: Benchmarking with Named Graphs

Does anyone know of any multi-graph benchmarking datasets? So rather than a dataset being just one big bag of triples to test with, they're split into multiple named graphs.

Thanks,
Marcus Cobden
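
On the space point in the reply above, a toy sketch may help. It assumes run-length encoding as a stand-in for whatever column compression a quad store actually applies, and it sorts quads in (s, p, o, g) order; the function names and the sample data are hypothetical and not taken from any particular store.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def g_column_runs(quads):
    """Sort quads by (s, p, o, g) and count the runs in the g column."""
    ordered = sorted(quads)
    g_column = [g for (_s, _p, _o, g) in ordered]
    return len(run_length_encode(g_column))


# Hypothetical toy data: 1000 subjects with one triple each.
single_graph = [(f"s{i:04d}", "p", "o", "g0") for i in range(1000)]
graph_per_page = [(f"s{i:04d}", "p", "o", f"g{i:04d}") for i in range(1000)]

print(g_column_runs(single_graph))    # 1
print(g_column_runs(graph_per_page))  # 1000
```

With a single graph the g column collapses to one run, so it costs almost nothing to store. With one graph per page every quad starts a new run, so under this simple scheme the g column has to be stored in full.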
Received on Wednesday, 30 November 2011 17:27:23 UTC