Re: Restpark - Minimal RESTful API for querying RDF triples from Jerven Bolleman on 2013-04-16 (public-lod@w3.org from April 2013)

From: Jerven Bolleman <me@jerven.eu>
Date: Tue, 16 Apr 2013 21:51:36 +0200
To: Aidan Hogan <aidan.hogan@deri.org>
Cc: public-lod@w3.org
Message-Id: <AF023624-D394-45BD-A5E8-A9D4016E83F8@jerven.eu>
Hi Aidan,

I think that you are jumping to conclusions. An RDF store that can hold any RDF data will answer queries 
in PSPACE, but that is an extremely generic solution. Nothing requires a fully SPARQL 1.1 query compliant engine 
to exhibit that worst-case behaviour on every dataset [1]. 

This is a thought exercise. Let's assume you have very limited data in your restpark endpoint.

example:a constant:predicate object:a
example:b constant:predicate object:b
.
.
.
example:z constant:predicate object:z

and so on. A subject is never an object in this dataset.

Then in this specific case 

/restpark?subject={subject}&predicate={predicate}&object={object}
can be translated directly from
/sparql?query=SELECT * WHERE {?subject ?predicate ?object}
If your dataset does not allow any joins, then your query engine can translate directly from the SPARQL variant into the restpark call.
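
A minimal sketch of that translation in Python, under some loud assumptions: the function name sparql_to_restpark is made up for illustration, the regular expression only recognises a query with exactly one triple pattern (no literals containing spaces), and an unbound variable is expressed by leaving the corresponding parameter out of the restpark URL, which is a guess about how restpark treats wildcards.

```python
import re
from urllib.parse import urlencode

# Accepts only: SELECT * WHERE { term term term }  (one triple pattern, no joins).
_SINGLE_PATTERN = re.compile(
    r"^\s*SELECT\s+\*\s+WHERE\s*\{\s*(\S+)\s+(\S+)\s+(\S+?)\s*\.?\s*\}\s*$",
    re.IGNORECASE,
)


def sparql_to_restpark(query, base="/restpark"):
    """Translate a single-triple-pattern SPARQL query into a restpark URL.

    Terms starting with '?' are variables and are omitted from the URL
    (assumption: a missing parameter means "match anything").
    """
    match = _SINGLE_PATTERN.match(query)
    if match is None:
        # Anything with joins, OPTIONAL, FILTER, ... is out of scope here.
        raise ValueError("only a single triple pattern is supported")
    params = {}
    for name, term in zip(("subject", "predicate", "object"), match.groups()):
        if not term.startswith("?"):
            params[name] = term
    return base + ("?" + urlencode(params) if params else "")
```

Anything the pattern does not recognise is rejected rather than evaluated, which is the whole point: the engine only ever has to perform the one lookup that restpark performs.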

So in this limited case, SPARQL over this data is in the same complexity class as restpark.
In other words restpark can be hidden behind a SPARQL query engine that limits itself to only answering
/sparql?query=SELECT * WHERE {?subject ?predicate ?object}
Actually, looking at the Jena and Sesame APIs for triple sources, running a SPARQL endpoint on top of your API is dead simple. 
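
To illustrate the idea without tying it to either library, here is a rough Python sketch. It is not the Jena or Sesame API: InMemoryTripleSource, get_statements and answer_single_pattern are hypothetical names, and the in-memory list stands in for whatever lookup restpark really does. The "engine" answers exactly one triple pattern and nothing else.

```python
class InMemoryTripleSource:
    """Hypothetical stand-in for a triple source: one lookup method, no joins."""

    def __init__(self, triples):
        self._triples = list(triples)

    def get_statements(self, s=None, p=None, o=None):
        """Yield every triple matching the bound positions; None is a wildcard."""
        for triple in self._triples:
            if all(want is None or want == got
                   for want, got in zip((s, p, o), triple)):
                yield triple


def answer_single_pattern(source, pattern):
    """Evaluate one triple pattern; terms starting with '?' are variables."""
    lookup = [None if term.startswith("?") else term for term in pattern]
    for triple in source.get_statements(*lookup):
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?") and binding.setdefault(term[1:], value) != value:
                break  # the same variable would need two different values
        else:
            yield binding


source = InMemoryTripleSource([
    ("example:a", "constant:predicate", "object:a"),
    ("example:b", "constant:predicate", "object:b"),
])
results = list(answer_single_pattern(source, ("?subject", "constant:predicate", "object:b")))
```

Each query costs one call to get_statements, so the work done is the same as for the equivalent restpark lookup.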

So I postulate that for this use case of yours SPARQL scales as well as restpark (the restpark API being a proper subset of SPARQL).

And you may say that's insane, but I have done similar work to make tab-delimited bioinformatics databases accessible this way [2]. 

The credo that SPARQL does not scale in comparison to restpark does not hold true, because restpark cannot answer the queries that SPARQL endpoints can.
Not competing in most of the performance race is not the same as winning the race ;)

Regards,
Jerven

[1] If full SPARQL update is supported then things change.
[2] See https://github.com/JervenBolleman/sparql-bed/ for a simple example (just implement the Sesame triple source API).

On Apr 16, 2013, at 9:28 PM, Aidan Hogan wrote:

> On 16/04/2013 20:02, Kingsley Idehen wrote:
>> On 4/16/13 2:53 PM, Aidan Hogan wrote:
>>> On 16/04/2013 19:21, Kingsley Idehen wrote:
>>>> SPARQL scales† ...
>>> 
>>> † with the minor exception of answering SPARQL queries.
>>> 
>>> Cheers,
>>> Aidan
>>> 
>>> 
>> Please clarify what you mean, especially with regards to how that could
>> be deficient to what some RESTful API might offer.
> 
> Sure.
> 
> 
> In theory, SPARQL evaluation (answering a SPARQL query) is P-Space Complete [1]. Aside from some nasty OPTIONAL examples it's NP-Complete in query complexity since you're doing graph matching (homomorphisms). It doesn't scale and only selective parts can be parallelised.
> 
> In practice, it is not at all difficult to find (interesting) SPARQL queries that existing engines cannot run completely and correctly. I (unfortunately) find them all the time without even trying.
> 
> SPARQL evaluation does not scale, cannot scale, will never scale.
> 
> But maybe I misunderstand. What do you mean when you say "SPARQL scales"? :)
> 
> 
> With respect to Restpark, binary search is logarithmic and answering a Restpark query would require one or two lookups. Some potential problems for scale arise from low-selectivity lookups, but that's a drop in the ocean compared to what's faced by SPARQL engines.
> 
> Cheers,
> Aidan
> 
> 
> [1] Jorge Pérez, Marcelo Arenas, Claudio Gutierrez: Semantics and complexity of SPARQL. ACM Trans. Database Syst. 34(3) (2009)
> 
>
Received on Tuesday, 16 April 2013 19:52:07 UTC