
Re: SHOULD use POST for expensive queries?

From: Steve Harris <S.W.Harris@ecs.soton.ac.uk>
Date: Wed, 18 Jan 2006 17:18:12 +0000
To: dawg mailing list <public-rdf-dawg@w3.org>
Message-ID: <20060118171812.GB29792@login.ecs.soton.ac.uk>

On Wed, Jan 18, 2006 at 10:38:03 -0500, Kendall Clark wrote:
> On Jan 18, 2006, at 10:26 AM, Seaborne, Andy wrote:
> >> Even very sophisticated query analysis can't tell you which RDF
> >> datasets are expensive to assemble.
> >
> >Very true.  It's not just the query that determines whether it will  
> >be expensive - it's the dataset as well (and the server load).
> Actually, now that I think about it, that's not *entirely* true. Real  
> (as opposed to toy) database cost models include table size, and even  
> for arbitrary 3rd party graphs, with clever caching and use of HTTP,  
> a SPARQL query analyzer could make some good guesses (so, imagine the  
> ideal case: all the graphs are cached locally and known to be fresh),  
> so it's not as bad as I made it seem.

It's not even that easy: without running the main part of the query for
real, it's not possible to calculate how much effort will be required to
satisfy the OPTIONAL blocks or how many UNION branches you will have to
take. My estimation code assumes worst-ish cases for those, which is often
not accurate.
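
To make the worst-ish-case idea concrete, here is a minimal sketch of
such an estimator. It is illustrative only and is not the actual
estimation code referred to above: the names Pattern, OptionalBlock,
UnionBlock and estimate, and the notion of a per-pattern cost and row
count, are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pattern:
    """A basic graph pattern with a guessed cost and result size (hypothetical)."""
    cost: float
    rows: float


@dataclass
class OptionalBlock:
    """required OPTIONAL { optional }"""
    required: object
    optional: object


@dataclass
class UnionBlock:
    """{ branch } UNION { branch } ..."""
    branches: List[object]


def estimate(node, worst=True):
    """Return a (cost, rows) guess for a query tree without running it."""
    if isinstance(node, Pattern):
        return node.cost, node.rows
    if isinstance(node, OptionalBlock):
        req_cost, req_rows = estimate(node.required, worst)
        opt_cost, opt_rows = estimate(node.optional, worst)
        if worst:
            # Assume the OPTIONAL block is tried once per required
            # solution and that every attempt matches.
            return req_cost + req_rows * opt_cost, req_rows * max(opt_rows, 1)
        # Optimistic guess: the OPTIONAL block adds no effort at all.
        return req_cost, req_rows
    if isinstance(node, UnionBlock):
        guesses = [estimate(branch, worst) for branch in node.branches]
        if worst:
            # Assume every UNION branch has to be taken.
            return sum(c for c, _ in guesses), sum(r for _, r in guesses)
        # Optimistic guess: only the cheapest branch is taken.
        return min(guesses, key=lambda g: g[0])
    raise TypeError(f"unknown node type: {type(node).__name__}")


query = OptionalBlock(
    required=Pattern(cost=10, rows=100),
    optional=UnionBlock([Pattern(cost=2, rows=5), Pattern(cost=3, rows=1)]),
)
worst_guess = estimate(query, worst=True)
best_guess = estimate(query, worst=False)
```

The gap between worst_guess and best_guess for the same query is the
point: the real cost lies somewhere in between, and only running the
required part of the query shows where.
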
> But in the common or pathological cases (where all graphs are  
> unknown, uncached, and have to be retrieved from arbitrary origin  
> servers), well... -shudder-.


- Steve
Received on Wednesday, 18 January 2006 17:18:26 UTC
