- From: Hugh Williams <hwilliams@openlinksw.com>
- Date: Tue, 9 Aug 2011 13:24:22 +0100
- To: Jörn Hees <j_hees@cs.uni-kl.de>
- Cc: public-lod@w3.org, dbpedia-discussion@lists.sourceforge.net
- Message-Id: <0ED59EAD-DDCD-4A6B-9D7F-00617D778D92@openlinksw.com>
Hi
The http://dbpedia.org/sparql endpoint rate-limits the number of connections per second you can make, and also restricts result set size and query time, as per the following settings:
[SPARQL]
ResultSetMaxRows = 2000
MaxQueryExecutionTime = 120
MaxQueryCostEstimationTime = 1500
These are in place to make sure that everyone has an equal chance to de-reference data from dbpedia.org, as well as to guard against badly written queries/robots.
The following options are at your disposal to get round these limitations:
1. Use the LIMIT and OFFSET keywords
You can tell a SPARQL query how many records to return and how many to skip, e.g.:
select ?s where { ?s a ?o }
LIMIT 1000 OFFSET 2000
2. Set up a dbpedia database in your own network
The dbpedia project provides full datasets, so you can set up your own installation on a sufficiently powerful box using Virtuoso Open Source Edition.
3. Set up a preconfigured installation of Virtuoso + database using Amazon EC2 (not free)
See: http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtAWSDBpedia351C
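For option 1, the paging can be scripted. The following is only a minimal Python sketch, not an official client: the helper names (paged_query, request_url, fetch_all) and the caller-supplied fetch_page function are hypothetical, and the page size of 1000 is just one value that stays below ResultSetMaxRows. Note that without an ORDER BY clause a SPARQL engine is not required to return rows in the same order for every page, so you may want to add one to the base query.

```python
from urllib.parse import urlencode

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint named in this message
PAGE_SIZE = 1000  # assumption: any value at or below ResultSetMaxRows (2000)


def paged_query(base_query, page, page_size=PAGE_SIZE):
    """Append LIMIT/OFFSET for the given zero-based page to base_query."""
    return f"{base_query.strip()}\nLIMIT {page_size} OFFSET {page * page_size}"


def request_url(query, endpoint=ENDPOINT):
    """Build a GET URL; 'query' is the standard SPARQL protocol parameter."""
    return endpoint + "?" + urlencode({"query": query})


def fetch_all(base_query, fetch_page, page_size=PAGE_SIZE):
    """Yield rows page by page until a page comes back short.

    fetch_page is a hypothetical caller-supplied function that runs one
    query string against the endpoint and returns a list of rows.
    """
    page = 0
    while True:
        rows = fetch_page(paged_query(base_query, page, page_size))
        yield from rows
        if len(rows) < page_size:
            break  # last (partial or empty) page reached
        page += 1
```

Calling paged_query("select ?s where { ?s a ?o }", 2) produces the same LIMIT 1000 OFFSET 2000 query shown under option 1. Since the endpoint also limits connections per second, a real fetch_page should pause between requests.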
Best Regards
Hugh Williams
Professional Services
OpenLink Software
Web: http://www.openlinksw.com
Support: http://support.openlinksw.com
Forums: http://boards.openlinksw.com/support
Twitter: http://twitter.com/OpenLink
On 9 Aug 2011, at 13:04, Jörn Hees wrote:
> On 9. Aug. 2011, at 13:15, Pablo Mendes wrote:
>>> 'yes, i also consider DBpedia buggy in this sense (hence the crossposting)'
>> Just a small note.
>> I think you mean that the SPARQL engine behind a particular deployment of DBpedia is behaving differently from what you would desire. Although there are bugs in DBpedia, this is not one of them. :) I think it is important to make this distinction between DBpedia and the SPARQL endpoints serving its contents exactly to point out that you could provide your own implementation/wrapper that sorts/limits results the way you want.
>
> Yes, this was imprecise. I was not talking about the SPARQL endpoint (which in fact is able to return more than 2001 triples per subject). I was talking about the standard thing that many people do with an HTTP URI: dereference it.
>
> I agree that other / local SPARQL endpoints are useful for mass queries and to take load off the DBpedia servers, but I don't see how they help in my case, as dereferencing still goes to the server(s) at dbpedia.org.
>
> Cheers,
> Jörn
>
>
Received on Tuesday, 9 August 2011 12:28:12 UTC