Re: Public SPARQL endpoints:managing (mis)-use and communicating limits to users.

On 19-Apr-13, at 8:04 AM, Leigh Dodds wrote:
> Hi,
> On Fri, Apr 19, 2013 at 11:55 AM, Kingsley Idehen
> <kidehen@openlinksw.com> wrote:
>> ...
>> If you have OFFSET and LIMIT in use, you can reflect the new state of
>> affairs when the next GET is performed, i.e., let's say you have
>> OFFSET 20 and LIMIT 20: the URL with OFFSET 40 is the request for the
>> next batch of results from the solution and the one that would reflect
>> the new state of affairs.
>
> This requires the client to page from the outset. Ideally there would
> be a way for a server to force paging where it needed to. At the
> moment though there's no way for a server to indicate that it's done
> that, e.g. by including a "next page" link in the results.

Is this not what Chunked Transfer Encoding could be used for? Some of
the misuse I see myself comes from people who try to fetch large result
sets through concurrent OFFSET / LIMIT queries over different connections.
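
To make the "next page" idea concrete, here is a minimal sketch of how a
server that forces paging might advertise the next batch. It is only an
illustration: the SPARQL protocol defines no such hint, the use of an HTTP
Link header with rel="next" is an assumption, and the endpoint URL and
function names are hypothetical. It also assumes the base query carries an
ORDER BY and no LIMIT/OFFSET of its own, since paging without a stable
ordering is not reliable.

```python
from urllib.parse import urlencode


def paged_query(base_query: str, offset: int, limit: int) -> str:
    """Append LIMIT/OFFSET to a query that has neither (assumption)."""
    return f"{base_query} LIMIT {limit} OFFSET {offset}"


def next_page_link(endpoint: str, base_query: str,
                   offset: int, limit: int) -> str:
    """Build a Link header value pointing at the batch after (offset, limit).

    The header itself is a hypothetical convention, not part of SPARQL.
    """
    query = paged_query(base_query, offset + limit, limit)
    url = f"{endpoint}?{urlencode({'query': query})}"
    return f'<{url}>; rel="next"'


# Hypothetical endpoint; the current page is OFFSET 20 LIMIT 20,
# so the advertised next page uses OFFSET 40.
link = next_page_link(
    "http://example.org/sparql",
    "SELECT * WHERE { ?s ?p ?o } ORDER BY ?s",
    offset=20,
    limit=20,
)
```

A client that follows the advertised link one page at a time pages
sequentially rather than firing many OFFSET / LIMIT requests in parallel.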

I believe the core of the problem is that we need to provide some
support for negotiation between clients and servers as to who will do
what work when. Maximum triple count and maximum query runtime are a
good start; perhaps we could even look at having an overloaded SPARQL
server leverage <owl:sameAs> and VoID entries to reply "307 Temporary
Redirect - Too busy to process your request, but consider these other
servers...".
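
As a rough sketch of that redirect idea, assuming the server already holds
a list of alternative endpoints (which might be derived from owl:sameAs or
VoID data; that lookup is not shown): the load threshold, the mirror URLs,
the function name, and the 503 fallback when no alternative is known are
all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple
from urllib.parse import urlencode

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)


def overload_response(active_queries: int, max_active: int,
                      mirrors: List[str], query: str) -> Optional[Response]:
    """Return None when the server can take the query itself, otherwise a
    redirect, or a plain refusal when no alternative server is known."""
    if active_queries < max_active:
        return None  # not overloaded: process the query locally
    if not mirrors:
        # Assumption: with nowhere to redirect to, fall back to 503.
        return 503, {"Retry-After": "60"}, "Too busy to process your request."
    qs = urlencode({"query": query})
    body = ("Too busy to process your request, but consider these other "
            "servers:\n" + "\n".join(mirrors))
    # Redirect to the first known alternative; list all of them in the body.
    return 307, {"Location": f"{mirrors[0]}?{qs}"}, body


# Hypothetical mirrors and load figures.
reply = overload_response(
    active_queries=10,
    max_active=10,
    mirrors=["http://a.example.org/sparql", "http://b.example.org/sparql"],
    query="ASK { ?s ?p ?o }",
)
```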

"202 Accepted" is an interesting idea for long-running queries (add
expected time to completion somewhere?); could we also look at letting
the client specify whether it is willing to wait or will try elsewhere?
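
One way this negotiation might look, purely as a sketch: the client states
how long it is willing to wait, the server compares that with its own
runtime estimate, and picks a status code. The cutoff value, the
X-Estimated-Seconds header, the status URL layout, and the use of 503 for
"try elsewhere" are all hypothetical; a redirect to another server would
fit that last case just as well.

```python
from typing import Dict, Optional, Tuple

SYNC_CUTOFF_SECONDS = 30  # hypothetical: longest query answered in-line


def plan_response(estimated_seconds: float,
                  client_max_wait: Optional[float],
                  job_id: str = "123") -> Tuple[int, Dict[str, str]]:
    """Pick a status code from the server's runtime estimate and the
    client's stated patience (None means the client said nothing)."""
    if estimated_seconds <= SYNC_CUTOFF_SECONDS:
        return 200, {}  # short query: answer directly
    if client_max_wait is not None and estimated_seconds <= client_max_wait:
        headers = {
            # Hypothetical URL where the client can poll for the result.
            "Location": f"/queries/{job_id}/status",
            # Hypothetical header carrying the expected time to completion.
            "X-Estimated-Seconds": str(int(estimated_seconds)),
        }
        return 202, headers
    # Client will not wait that long (or did not say): let it try elsewhere.
    return 503, {}


quick = plan_response(5, None)
patient = plan_response(120, 300)
impatient = plan_response(120, 60)
```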

rhw

Received on Friday, 19 April 2013 14:05:14 UTC