W3C home > Mailing lists > Public > public-sparql-dev@w3.org > April to June 2006

RE: sparql performance questions

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Tue, 20 Jun 2006 12:22:44 +0100
Message-ID: <DF5E364A470421429AE6DC96979A4F6FAFF6F7@sdcexc04.emea.cpqcorp.net>
To: "Denis Gaertner" <denis_gaertner@gmx.net>, <public-sparql-dev@w3.org>

-------- Original Message --------
> From: Denis Gaertner <>
> Date: 19 June 2006 19:30
> Hi!
> I got some questions concerning performance optimization in sparql
> queries. Since I couldn't find any good information on the net about
> this(please correct me if this is wrong) I thought this might be the
> right place for it.   

It's a bit early in the SPARQL lifecycle as the spec is in
"implementability" phase (CR in W3C speak).

> So I am wondering how the structure of a complex query affects the
> processing and query time. You can group Graphpatterns to an
> arbitrarily depth and I'd like to know if a structure which is
> equivalent maybe to something like DNF or CNF in feature sets could
> help. So for example : { {?s p "o"}  
> {?s p2 "o2"}
> }
> {?s p3 "o3"}
> becomes
> {?s p "o" . ?s p3 "o3"}
> {?s p2 "o2" . ?s p3 "o3"}
> Now this put in a large scale means breaking down lots of levels into
> fewer. Is this of advantage or is the rdf graph structure not suitable
> for this kind of transformation? Or is it solely a question of
> implementation?   

It's highly implementation dependent.  Consider a system that can only
indexes triples but in a local store, compared to one that can ship a
conjunctive expression (or more complex) to a conjunction engine (a SQL
join expression) but incurs overhead in doing so.  Or a system that can
dispatch parallel subqueries to a cluster but has lots of spare
processing power (probably better to use the latter form and combine the
branches) to tradeoff query elapsed time against CPU cycles.

> And another thing is about constraints. Is it better to put them into
> group or to collect them and put them at the end if possible? So for 
> instance:
> {?s p ?o1 FILTER (?o1...)}
> {?s p2 ?o2 FILTER (?o2...)}
> or
> {?s p ?o1} FILTER (?o1...)
> {?s p2 ?o2} FILTER (?o2...)
> or
> {?s p ?o1}
> {?s p2 ?o2}
> FILTER (?o1... && ?o2...)
> So this is it and since I don't know anything on how these queries are
> processed I am not able to figure that out by myself. 

Style-wise: write as you think is clearest.

Performance-wise: I'd expect most implementations to put the filter in
the best place (it's relatively easy to do although combined with
reordering the triple patterns there two dimensions of change to


> Thanks
> Denis
Received on Tuesday, 20 June 2006 12:14:12 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 6 January 2015 20:15:49 UTC