
DESCRIBE optimizations (was RE: Berlin SPARQL Benchmark V2 - Results for Sesame, Virtuoso, Jena TDB, D2R Server, and MySQL)

From: Kjetil Kjernsmo <Kjetil.Kjernsmo@computas.com>
Date: Wed, 24 Sep 2008 10:30:16 +0200
To: semantic-web@w3.org
Message-Id: <200809241030.16696.Kjetil.Kjernsmo@computas.com>

Dear all,

I would like to thank Prof. Bizer and his group for undertaking this very
interesting study, and Orri and the other participants here for their
interesting elaboration.

I'm in the process of evaluating several SPARQL backends for use in several 
projects, and right now, I'm looking at Virtuoso. 

I noticed this:
>A lesser item in the same  direction is the use of describe, which is not
>commensurate between SPARQL and SQL and not even between SPARQL's. 

Even though the result of a SPARQL DESCRIBE is not standardised, we have found
it extremely useful as a "give me everything you know about </foo>" query. We
have also found it useful that it preserves the semantics of the original data.
Thus, most of the queries we ask are DESCRIBEs, and we do have some performance
issues with them.

I would like to hear your opinion on a possible optimisation in light of this:
> The BSBM workload typically retrieves multiple dependent attributes of a 
> single key.  If these attributes are all next to each other, as in a 
> relational row store, then we have a constant time for the extra attribute 
> instead of a log of the database size. 

This is very interesting and I can see why this is so, but we look upon our 
DESCRIBEs as retrieving multiple attributes of the same single key (URI). 
Thus, would it be possible to optimize for this situation somehow, by putting 
these attributes next to each other?
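
To make the question concrete, here is a minimal sketch in Python of the layout
I have in mind. It is not how Virtuoso or any other store actually organises
its data; it assumes a hypothetical in-memory list of triples kept sorted by
subject, so that one binary search finds the first triple of a URI and every
further attribute is simply the next row. The names `TRIPLES` and `describe`
and the sample data are made up for illustration, and the sketch only returns
outgoing triples, which is narrower than what many DESCRIBE implementations do.

```python
from bisect import bisect_left

# Hypothetical triple table, kept sorted by (subject, predicate, object).
TRIPLES = sorted([
    ("http://example.org/alice", "foaf:name", "Alice"),
    ("http://example.org/alice", "foaf:mbox", "mailto:alice@example.org"),
    ("http://example.org/bob", "foaf:name", "Bob"),
    ("http://example.org/carol", "foaf:name", "Carol"),
])


def describe(subject, triples=TRIPLES):
    """Return all triples with the given subject from a subject-sorted list."""
    # One O(log n) search locates the first triple for this subject;
    # the 1-tuple (subject,) sorts before every 3-tuple starting with it.
    i = bisect_left(triples, (subject,))
    result = []
    # Every further attribute is adjacent, so each costs one scan step.
    while i < len(triples) and triples[i][0] == subject:
        result.append(triples[i])
        i += 1
    return result
```

With this layout, a DESCRIBE of one URI pays the logarithmic lookup once rather
than once per attribute.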

As an aside, we don't do this a lot now, but it seems important to be able to
quickly retrieve all data about something for which you know only the value of
an inverse functional property (IFP), e.g.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
DESCRIBE ?user WHERE { ?user foaf:mbox <mailto:dahut@example.org> . }

Given that you know which node to DESCRIBE at the first hit if foaf:mbox is an
IFP, is this a situation that could be optimized for?
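
A sketch of the shortcut I am asking about, in Python and purely illustrative:
if the store knows that a predicate is an IFP, the lookup could stop at the
first matching triple instead of scanning for further matches. The names
`IFPS`, `TRIPLES` and `find_subjects` and the data are hypothetical, and the
sketch assumes the data does not contain two different URIs for the same
individual.

```python
# Hypothetical set of predicates declared as inverse functional properties.
IFPS = {"foaf:mbox"}

# Hypothetical, unsorted triple list used only for this illustration.
TRIPLES = [
    ("http://example.org/alice", "foaf:mbox", "mailto:alice@example.org"),
    ("http://example.org/dahut", "foaf:mbox", "mailto:dahut@example.org"),
    ("http://example.org/dahut", "foaf:name", "Dahut"),
]


def find_subjects(predicate, obj, triples=TRIPLES):
    """Return subjects having (predicate, obj); stop early for an IFP."""
    subjects = []
    for s, p, o in triples:
        if p == predicate and o == obj:
            subjects.append(s)
            # Under the stated assumption, an IFP value identifies at most
            # one URI, so the first hit is the only one and the scan ends.
            if predicate in IFPS:
                break
    return subjects
```

The single subject returned for an IFP could then be handed straight to
whatever routine builds the DESCRIBE result.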

Kind regards 

Kjetil Kjernsmo
-- 
Senior Knowledge Engineer
Mobile: +47 986 48 234
Email: kjetil.kjernsmo@computas.com   
Web: http://www.computas.com/

|  SHARE YOUR KNOWLEDGE  |

Computas AS  PO Box 482, N-1327 Lysaker | Phone:+47 6783 1000 | Fax:+47 6783 1001
Received on Wednesday, 24 September 2008 08:37:12 GMT
