Re: Public SPARQL endpoints:managing (mis)-use and communicating limits to users.

On Thu, Apr 18, 2013 at 7:53 AM, Jerven Bolleman <jerven.bolleman@isb-sib.ch
> wrote:

> Last but not least how can we avoid that users need to run SELECT
> (COUNT(DISTINCT ?s) AS ?sc) WHERE {?s ?p ?o} and friends.


I am interested in why queries like this are not optimized. It seems to me
that this should be straightforward to optimize by looking at index
structures.
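
To make the index idea concrete, here is a toy sketch (not how any
particular store implements it). It assumes the triples are held in a sorted
in-memory list standing in for an SPO or POS index, and the helper names
upper_bound and distinct_leading are made up for illustration. The point is
that distinct subjects or predicates can be found by jumping from one
leading key to the next instead of visiting every triple.

```python
def upper_bound(index, key, lo):
    """Return the first position >= lo whose leading component is > key."""
    hi = len(index)
    while lo < hi:
        mid = (lo + hi) // 2
        if index[mid][0] <= key:
            lo = mid + 1
        else:
            hi = mid
    return lo


def distinct_leading(index):
    """List the distinct leading components of a sorted list of tuples.

    Each distinct key costs one binary search, so the work depends on the
    number of distinct keys rather than on the number of triples.
    """
    keys = []
    pos = 0
    while pos < len(index):
        key = index[pos][0]
        keys.append(key)
        # Skip every remaining entry that shares this leading key.
        pos = upper_bound(index, key, pos + 1)
    return keys


# Toy data: the same triples sorted two ways (hypothetical SPO and POS indexes).
triples = [
    ("a", "p1", "x"),
    ("a", "p2", "y"),
    ("b", "p1", "x"),
    ("c", "p1", "z"),
    ("c", "p3", "x"),
]
spo = sorted(triples)
pos_index = sorted((p, o, s) for (s, p, o) in triples)

subject_count = len(distinct_leading(spo))   # answers COUNT(DISTINCT ?s)
predicates = distinct_leading(pos_index)     # answers "list the predicates"
```

With few distinct predicates and many triples, the second call touches only
a handful of index positions, which may be why a hand-written procedure can
be so much faster than a generic plan that scans all triples.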

It's always rather disappointing to me that basic queries like this aren't
very fast. I remember that we had a stored procedure for listing the
predicates used in the store. It ran in a fraction of a second, while the
straightforward query took ages.

Rather than struggling to have users avoid basic, useful queries, how about
making them work well?

As use evolves, people reach a level where they do need to be cognizant of
how queries are run. At that point, there's not a simple way to say which
queries to avoid.

The most useful tools are those that expose query plans as clearly as
possible, highlight which parts of them are taking the most time, and come
with a reference page that helps people configure their database or
reformulate queries to address the execution problems that arise. A first
step towards this, if you are using Virtuoso, is to always ask for the
query cost and display it with a link to ask for the query plan. With a
little more work you can speculatively run the query for a bit and, if it
times out, display the query plan alongside the error message (or include
it in the error message itself). If you want to give your users a little
more control and think they will take advantage of it, you could add some
way for them to state their guess of whether the query is easy, moderate,
or hard, and allocate time to the query accordingly (e.g., have
buttons/services for easy, moderate, and hard in place of a single
execute-query button).
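
The easy/moderate/hard idea could be sketched roughly as follows. This is
only an illustration under stated assumptions: the budget values, the
run_with_budget function, and the execute/explain callables are all
hypothetical stand-ins, not the API of Virtuoso or any other store. A real
endpoint would also cancel the query on the server when the budget runs out,
which this sketch does not do.

```python
import concurrent.futures as cf
import time

# Hypothetical time budgets, in seconds, for the user's difficulty guess.
BUDGET_SECONDS = {"easy": 0.05, "moderate": 0.2, "hard": 1.0}


def run_with_budget(query, difficulty, execute, explain):
    """Run execute(query) within the budget for the chosen difficulty.

    On timeout, return the plan from explain(query) instead of rows, so the
    user can see why the query was expensive.
    """
    budget = BUDGET_SECONDS[difficulty]
    pool = cf.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(execute, query)
    try:
        return {"status": "ok", "rows": future.result(timeout=budget)}
    except cf.TimeoutError:
        # The worker thread keeps running here; a real service would
        # cancel the query on the server side as well.
        return {"status": "timeout", "budget": budget, "plan": explain(query)}
    finally:
        pool.shutdown(wait=False)


def fake_execute(query):
    """Stand-in for a store: full-scan patterns are slow, others instant."""
    time.sleep(0.3 if "?s ?p ?o" in query else 0.0)
    return [("42",)]


def fake_explain(query):
    """Stand-in for whatever the store offers to describe a query plan."""
    return "full scan over all triples"


query = "SELECT (COUNT(DISTINCT ?s) AS ?sc) WHERE {?s ?p ?o}"
as_easy = run_with_budget(query, "easy", fake_execute, fake_explain)
as_hard = run_with_budget(query, "hard", fake_execute, fake_explain)
```

Here the same query times out under the "easy" budget and comes back with
its plan, while under the "hard" budget it completes and returns rows.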

Here are a couple of pages we compiled about performance. I expect they
are out of date, as we haven't tended to them in a few years, but perhaps
they will be of use to someone.

http://neurocommons.org/page/Virtuoso_performance

Received on Thursday, 18 April 2013 15:34:14 UTC