Re: Restpark - Minimal RESTful API for querying RDF triples from Jerven Bolleman on 2013-04-16 (public-lod@w3.org from April 2013)

From: Jerven Bolleman <me@jerven.eu>
Date: Tue, 16 Apr 2013 21:55:11 +0200
To: Jerven Bolleman <me@jerven.eu>
Cc: Aidan Hogan <aidan.hogan@deri.org>, public-lod@w3.org
Message-Id: <3C4B425A-EA57-450E-ACBD-1969EE00BB25@jerven.eu>
In summary: if the problem you have is P-Space complete, it's going to run in P-Space unless you do some really smart stuff.
restpark refuses to try to solve a P-Space complete problem, pushing all of that work back to the programmer using the restpark API.
This approach does not scale in man-hours...
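
To make the "work pushed back to the programmer" point concrete, here is a minimal sketch of a client-side join built on single triple pattern lookups. Everything in it is an assumption for illustration: `match` is a hypothetical in-memory stand-in for one restpark-style call (it is not the real restpark API), and the `ex:` data is made up. The query being answered is what a SPARQL endpoint would take as `SELECT ?name WHERE { ?p ex:worksAt ex:acme . ?p ex:name ?name }`.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Made-up toy data; in a real setup these triples would sit behind the HTTP endpoint.
TRIPLES: List[Triple] = [
    ("ex:alice", "ex:worksAt", "ex:acme"),
    ("ex:bob", "ex:worksAt", "ex:acme"),
    ("ex:carol", "ex:worksAt", "ex:other"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:carol", "ex:name", "Carol"),
]


def match(subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> Iterator[Triple]:
    """Hypothetical stand-in for one restpark-style lookup; None is a wildcard."""
    for s, p, o in TRIPLES:
        if subject in (None, s) and predicate in (None, p) and obj in (None, o):
            yield (s, p, o)


def names_of_employees(employer: str) -> List[str]:
    """Hand-written join: one lookup for the first pattern,
    then one further lookup per binding of ?p for the second pattern."""
    names = []
    for person, _, _ in match(predicate="ex:worksAt", obj=employer):
        for _, _, name in match(subject=person, predicate="ex:name"):
            names.append(name)
    return sorted(names)
```

Each additional triple pattern in the query adds another nested loop of lookups, and every client has to write and maintain that loop itself instead of handing the join to a query engine.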

On Apr 16, 2013, at 9:51 PM, Jerven Bolleman wrote:

> Hi Aidan,
> 
> I think that you are jumping to conclusions. An RDF store that can hold any RDF data will answer queries
> in P-Space, but that is an extremely generic solution. A fully SPARQL 1.1 query compliant engine does not need to hold
> true to this [1].
> 
> This is a thought exercise. Let's assume you have very limited data in your restpark endpoint.
> 
> example:a constant:predicate object:a
> example:b constant:predicate object:b
> .
> .
> .
> example:z constant:predicate object:z
> 
> and so on. A subject is never an object in this dataset.
> 
> Then in this specific case 
> 
> /restpark?subject={subject}&predicate={predicate}&object={object}
> can be translated directly from
> /sparql?query=SELECT * WHERE {?subject ?predicate ?object}
> If your dataset does not allow any joins, then your query engine can translate directly from the SPARQL variant into the restpark call.
> 
> So in this limited case, SPARQL over this data is in the same complexity class as restpark.
> In other words restpark can be hidden behind a SPARQL query engine that limits itself to only answering
> /sparql?query=SELECT * WHERE {?subject ?predicate ?object}
> Actually, looking at the Jena and Sesame APIs for a triple source, running a SPARQL endpoint on top of your API is dead simple.
> 
> So I postulate that for this use case of yours SPARQL scales as well as restpark (the restpark API being a proper subset of SPARQL).
> 
> And you may say that's insane, but I have done similar work to make tab-delimited bioinformatics databases accessible this way [2].
> 
> The credo that SPARQL does not scale in comparison to restpark does not hold true because restpark cannot answer queries that SPARQL endpoints could.
> Not competing in most of the performance race is not the same as winning the race ;)
> 
> Regards,
> Jerven
> 
> [1] If full SPARQL update is supported then things change.
> [2] See https://github.com/JervenBolleman/sparql-bed/ for a simple example (just implement the sesame triple source API).
> 
> On Apr 16, 2013, at 9:28 PM, Aidan Hogan wrote:
> 
>> On 16/04/2013 20:02, Kingsley Idehen wrote:
>>> On 4/16/13 2:53 PM, Aidan Hogan wrote:
>>>> On 16/04/2013 19:21, Kingsley Idehen wrote:
>>>>> SPARQL scales† ...
>>>> 
>>>> † with the minor exception of answering SPARQL queries.
>>>> 
>>>> Cheers,
>>>> Aidan
>>>> 
>>>> 
>>> Please clarify what you mean, especially with regards to how that could
>>> be deficient compared to what some RESTful API might offer.
>> 
>> Sure.
>> 
>> 
>> In theory, SPARQL evaluation (answering a SPARQL query) is P-Space Complete [1]. Aside from some nasty OPTIONAL examples it's NP-Complete in query complexity since you're doing graph matching (homomorphisms). It doesn't scale and only selective parts can be parallelised.
>> 
>> In practice, it is not at all difficult to find (interesting) SPARQL queries that existing engines cannot run completely and correctly. I (unfortunately) find them all the time without even trying.
>> 
>> SPARQL evaluation does not scale, cannot scale, will never scale.
>> 
>> But maybe I misunderstand. What do you mean when you say "SPARQL scales"? :)
>> 
>> 
>> With respect to Restpark, binary search is logarithmic and answering a Restpark query would require one or two lookups. Some potential problems for scale arise from low-selectivity lookups, but that's a drop in the ocean compared to what's faced by SPARQL engines.
>> 
>> Cheers,
>> Aidan
>> 
>> 
>> [1] Jorge Pérez, Marcelo Arenas, Claudio Gutierrez: Semantics and complexity of SPARQL. ACM Trans. Database Syst. 34(3) (2009)
>> 
>> 
>
Received on Tuesday, 16 April 2013 19:55:45 UTC