RE: Berlin SPARQL Benchmark V2 - Results for Sesame, Virtuoso, Jena TDB, D2R Server, and MySQL from Seaborne, Andy on 2008-09-25 (semantic-web@w3.org from September 2008)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Thu, 25 Sep 2008 14:24:40 +0000
To: Story Henry <henry.story@bblfish.net>
CC: "semantic-web@w3.org" <semantic-web@w3.org>
Message-ID: <B6CF1054FDC8B845BF93A6645D19BEA35536E74BB7@GVW1118EXC.americas.hpqcorp.net>



> -----Original Message-----
> From: public-lod-request@w3.org [mailto:public-lod-request@w3.org] On
> Behalf Of Story Henry
> Sent: 24 September 2008 22:17
> To: Paul Gearon
> Cc: semantic-web@w3.org; public-lod@w3.org
> Subject: Re: Berlin SPARQL Benchmark V2 - Results for Sesame, Virtuoso,
> Jena TDB, D2R Server, and MySQL
>
>
> As a matter of interest, would it be possible to develop RDF stores
> that optimize the layout of the data by analyzing the queries to the
> database? A bit like a Java Just In Time compiler analyses the usage
> of the classes in order to decide how to optimize the compilation.

On a similar note, by mining the query logs it would be possible to create parameterised queries and associated plan fragments without the client needing to notify the server of the templates.  Coupled with automatically calculating possible materialized views or other layout optimizations, this means the poor, overworked client application writer doesn't get brought into optimizing the server.
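
As a rough illustration of the log-mining idea, here is a minimal Python sketch that collapses logged queries into templates by replacing constants with a placeholder and counting how often each template recurs. Everything in it is an assumption made for illustration: the names `to_template` and `mine_templates`, the `?param` placeholder, and the choice to treat only string and numeric literals as parameters (IRIs are left alone). A real server would work on the parsed query algebra rather than on text.

```python
import re
from collections import Counter

# Hypothetical tokenizer: IRIs are matched only so that digits inside them
# are left untouched; string and numeric literals are treated as parameters.
_TOKEN = re.compile(r'''
    (?P<iri><[^<>\s]*>)
  | (?P<string>"(?:[^"\\]|\\.)*")
  | (?P<number>(?<![\w?$:])\d+(?:\.\d+)?\b)
''', re.VERBOSE)


def to_template(query):
    """Return the query with literals replaced by a ?param placeholder."""
    collapsed = " ".join(query.split())  # normalise whitespace first

    def replace(match):
        if match.lastgroup == "iri":
            return match.group(0)  # keep IRIs exactly as written
        return "?param"

    return _TOKEN.sub(replace, collapsed)


def mine_templates(log, min_count=2):
    """Group logged queries by template; keep those seen at least min_count times."""
    counts = Counter(to_template(q) for q in log)
    return [(tpl, n) for tpl, n in counts.most_common() if n >= min_count]
```

Templates that recur often would then be the candidates for cached plan fragments or materialized views.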

        Andy

>
> Henry
>
> On 24 Sep 2008, at 20:30, Paul Gearon wrote:
>
> > A related point is that processing RDF to create an object means you
> > have to move around a lot in the graph. This could mean a lot of
> > seeking on disk, while an RDBMS will usually find the entire object in
> > one place on the disk. And seeks kill performance.
> >
> > This leads to the operations used to build objects from an RDF store.
> > A single object often requires the traversal of several statements,
> > where the object of one statement becomes the subject of the next.
> > Since the tables are typically represented as
> > Subject/Predicate/Object, this means that the main table will be
> > "joined" against itself. Even RDBMSs are notorious for not doing this
> > efficiently.
>
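
The self-join Paul describes can be sketched with Python's standard `sqlite3` module. This is only an illustration under stated assumptions: the single `triples` table, the `ex:` names, the sample data, and the helper `city_of` are all made up here and are not how any of the benchmarked stores lay out their data. The point is that following one link (person to address to city) already requires joining the table against itself, with the object of the first statement matched against the subject of the second.

```python
import sqlite3

# Hypothetical single-table layout: one row per statement.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:alice", "ex:name", "Alice"),
        ("ex:alice", "ex:address", "_:addr1"),
        ("_:addr1", "ex:city", "Berlin"),
        ("_:addr1", "ex:country", "Germany"),
    ],
)


def city_of(conn, person):
    """Follow person -> address -> city with one self-join of the triple table."""
    row = conn.execute(
        """
        SELECT t2.o
        FROM triples AS t1
        JOIN triples AS t2 ON t2.s = t1.o   -- object of t1 is subject of t2
        WHERE t1.s = ? AND t1.p = 'ex:address' AND t2.p = 'ex:city'
        """,
        (person,),
    ).fetchone()
    return row[0] if row else None
```

Each further hop in the graph adds another alias of the same table to the join, which is where the seek cost mentioned above comes from.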

Received on Thursday, 25 September 2008 14:26:01 UTC