Re: How to publish SPARQL endpoint limits/metadata?

On 10/9/13 5:45 AM, Frans Knibbe | Geodan wrote:
> Hello,
>
> If it has really been established that there is no standardized way to 
> make service limitations like these known, I think it would be a 
> nice idea to try to initiate a community project.
>
> Below are some things that I think would benefit communication between 
> SPARQL endpoints and user agents. Please know that I am a novice in 
> Linked Data, so perhaps some of these are already covered by existing 
> standards or best practices, or do not make sense.
>
>  1. The maximum number of results per request (hard limit)
>  2. The number of remaining requests (This could be used for a
>     throttling mechanism that allows only a certain number of requests
>     per unit of time and IP address. I remember having talked to a
>     data service on the web where this information was put in the HTTP
>     response headers)
>  3. The time period of the next scheduled downtime
>  4. The version(s) of the protocol that are supported
>  5. (the URI of) a document that contains a human readable SLA or fair
>     use policy for the service
>  6. URIs of mirrors
>
> Regards,
> Frans
>

Plus:

7. Query timeout (in milliseconds), which sets the maximum processing 
time allowed per query.

Ideally, you want to use a combination of timeouts, result size (max. 
results per query), offset, and limit to enable paging through data.
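
A minimal sketch of that kind of paging, in Python, assuming the query already has an ORDER BY (without one, pages are not guaranteed to be stable) and that the page size is no larger than the endpoint's max. results per query. The names paged, fetch_all, and run_query are hypothetical; run_query stands for whatever client call sends one query to the endpoint, with a client-side timeout of your choosing, and returns the result rows as a list.

```python
from typing import Callable, Iterator, List


def paged(query: str, page_size: int, offset: int) -> str:
    """Append LIMIT and OFFSET clauses to an already ordered SELECT query."""
    return f"{query}\nLIMIT {page_size}\nOFFSET {offset}"


def fetch_all(run_query: Callable[[str], List[dict]],
              query: str,
              page_size: int) -> Iterator[dict]:
    """Yield every row by requesting pages until a short page comes back.

    run_query is a hypothetical callable (not part of any library) that
    sends one query string to the endpoint and returns its rows.
    """
    offset = 0
    while True:
        rows = run_query(paged(query, page_size, offset))
        yield from rows
        # A page shorter than page_size is taken to mean the data ran out.
        if len(rows) < page_size:
            break
        offset += page_size
```

Note that the stop condition only works if the client knows the endpoint's result limit: if the server silently truncates at fewer rows than page_size, the first page already looks like the last one. That is one reason to publish the limit.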

Kingsley
>
>
> On 8-10-2013 17:45, Leigh Dodds wrote:
>> Hi,
>>
>> As others have suggested, extending service descriptions would be the
>> best way to do this. This might make a nice little community project.
>>
>> It would be useful to itemise a list of the type of limits that might
>> be faced, then look at how best to model them.
>>
>> Perhaps something we could do on the list?
>>
>> Cheers,
>>
>> L.
>>
>>
>>
>> On Tue, Oct 8, 2013 at 10:46 AM, Frans Knibbe | Geodan
>> <frans.knibbe@geodan.nl>  wrote:
>>> Hello,
>>>
>>> I am experimenting with running SPARQL endpoints and I notice the need to
>>> impose some limits to prevent overloading/abuse. The easiest and I believe
>>> fairly common way to do that is to LIMIT the number of results that the
>>> endpoint will return for a single query.
>>>
>>> I now wonder how I can publish the fact that my SPARQL endpoint has a LIMIT
>>> and that it has a certain value.
>>>
>>> I have read the thread "Public SPARQL endpoints: managing (mis)-use and
>>> communicating limits to users", but that seemed to be about how to
>>> communicate limits during querying. I would like to know if there is a way
>>> to communicate limits before querying is started.
>>>
>>> It seems to me that a logical place to publish a limit would be in the
>>> metadata of the SPARQL endpoint. Those metadata could contain all limits
>>> imposed on the endpoint, and perhaps other things like a SLA or a
>>> maintenance schedule... data that could help in the proper use of the
>>> endpoint by both software agents and human users.
>>>
>>> So perhaps my enquiry really is about a standard for publishing SPARQL
>>> endpoint metadata, and how to access them.
>>>
>>> Greetings,
>>> Frans
>>>
>>>
>>> --------------------------------------
>>> Geodan
>>> President Kennedylaan 1
>>> 1079 MB Amsterdam (NL)
>>>
>>> T +31 (0)20 - 5711 347
>>> E frans.knibbe@geodan.nl
>>> www.geodan.nl  | disclaimer
>>> --------------------------------------
>>


-- 

Regards,

Kingsley Idehen	
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen

Received on Wednesday, 9 October 2013 11:21:14 UTC