Re: Is there a way to automatically distinguish SPARQL endpoint and LDF server?

Hi Maxim and Ruben,
for completeness, I would add that SPARQL endpoints MAY in fact also
describe themselves in RDF, using the "SPARQL 1.1 Service Description
Vocabulary" [1].
Some endpoints (16.64%, according to [2]) do so, for example DBpedia [3].
Nevertheless, many endpoints do not, so at the moment you could use
it only to identify some "well-behaved" SPARQL endpoints.
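
A minimal sketch of such a check in Python, standard library only. It assumes the endpoint answers a plain GET (without a query parameter) with its service description when asked for Turtle, and it only looks for the Service Description namespace in the body instead of parsing the RDF. The function names are illustrative, not part of any spec.

```python
import urllib.request

# Namespace of the SPARQL 1.1 Service Description vocabulary [1].
SD_NS = "http://www.w3.org/ns/sparql-service-description#"


def mentions_service_description(body: str) -> bool:
    """Crude check: does the returned text use the sd: vocabulary at all?"""
    return SD_NS in body


def fetch_turtle(url: str, timeout: float = 10.0) -> str:
    """GET the URL asking for Turtle and return the body as text."""
    request = urllib.request.Request(url, headers={"Accept": "text/turtle"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def is_self_describing_endpoint(url: str) -> bool:
    """True if the URL returns something that uses the sd: vocabulary."""
    try:
        return mentions_service_description(fetch_turtle(url))
    except OSError:  # URLError, HTTPError and timeouts are all OSError
        return False
```

A positive result is only a hint (any page that mentions the namespace would match), and a negative result says nothing, since most endpoints publish no description.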

By the way, I guess some other "brute force" methods can be used to test
whether a URL points to a SPARQL endpoint, such as sending a very simple
SPARQL 1.0 query.
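
Such a probe might look like the sketch below. It assumes the candidate endpoint accepts the query via GET in the "query" parameter, as the SPARQL protocol describes, and it treats a SPARQL results media type in the response as the positive signal. The ASK query and the helper names are illustrative choices.

```python
import urllib.parse
import urllib.request

# A trivial query that is already valid in SPARQL 1.0.
PROBE_QUERY = "ASK { ?s ?p ?o }"

# Media types defined for SPARQL query results (XML and JSON).
RESULT_TYPES = (
    "application/sparql-results+xml",
    "application/sparql-results+json",
)


def build_probe_url(endpoint: str, query: str = PROBE_QUERY) -> str:
    """Append the URL-encoded query as the 'query' parameter."""
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + urllib.parse.urlencode({"query": query})


def is_sparql_result_type(content_type: str) -> bool:
    """True if a Content-Type header names a SPARQL results format."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in RESULT_TYPES


def answers_sparql(endpoint: str, timeout: float = 10.0) -> bool:
    """Send the probe and report whether the reply looks like SPARQL results."""
    request = urllib.request.Request(
        build_probe_url(endpoint),
        headers={"Accept": ", ".join(RESULT_TYPES)},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return is_sparql_result_type(response.headers.get("Content-Type", ""))
    except OSError:  # network errors, HTTP 4xx/5xx, timeouts
        return False
```

A True here is fairly strong evidence of a SPARQL endpoint. A False is not conclusive: the endpoint may accept only POST, require authentication, or simply be down.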

Best,
Miguel

[1] http://www.w3.org/TR/sparql11-service-description/
[2] http://sparqles.ai.wu.ac.at/discoverability
[3] http://live.dbpedia.org/sparql

On Fri, 21 Aug 2015 at 16:18, Ruben Verborgh <ruben.verborgh@ugent.be> wrote:

> Hi Maxim,
>
> Glad you're asking about this!
>
> > Is there a way (more reliable than the one described below) to
> > distinguish a SPARQL endpoint from an LDF server based on the URL or
> > on a response to a request sent to this URL?
>
> First of all, a little nitpick:
> you probably want to distinguish SPARQL endpoints from TPF servers.
> With "LDF", we mean all servers that publish Linked Data in some way
> (so this also includes SPARQL endpoints).
> A "TPF" (Triple Pattern Fragments) server is a specific kind of LDF server
> that offers access to triples by triple pattern.
>
> The reliable way to detect a TPF interface is to look inside of the
> response.
> The TPF interface is self-describing; it literally tells clients what it
> does.
> For example, take the resource with URL http://bit.ly/1I0eNgt
> (I purposely used a URL shortener here so the URL itself gives nothing
> away).
> If you request an RDF-based representation with
>     curl -L -H "Accept: text/turtle" http://bit.ly/1I0eNgt
> it will contain the following triples (reformatted for readability):
>
>     <http://fragments.dbpedia.org/2015/en#dataset>
>         a void:Dataset, hydra:Collection;
>         void:subset <>;
>         hydra:search [
>           hydra:template "http://fragments.dbpedia.org/2015/en{?subject,predicate,object}";
>           hydra:mapping [
>             hydra:variable "subject";
>             hydra:property rdf:subject
>           ], [
>             hydra:variable "predicate";
>             hydra:property rdf:predicate
>           ], [
>             hydra:variable "object";
>             hydra:property rdf:object
>           ]
>         ].
>
> Or, in human language:
> "This resource is a subset of the DBpedia 2015 dataset.
>  You can search it by RDF subject, predicate, and object."
> In other words: "this server supports the TPF interface".
>
> A SPARQL endpoint would not tell you any of this,
> because its interface is not self-describing.
>
> Summarizing: if a server replies with the above, it supports the TPF
> interface.
> If the responses do not contain this, it is certainly not a TPF interface.
> It might be a SPARQL endpoint, or it might be something else.
>
> > One heuristic which could help is the status code of the response to a
> > request with empty query parameter. If the server responded with 5xx
> > or 4xx code then it's a SPARQL endpoint, because it expects non-empty
> > query parameter.
>
> So what we're discussing here is to test whether something is a SPARQL
> endpoint.
> According to the SPARQL 1.1 Protocol (
> http://www.w3.org/TR/sparql11-protocol/#query-operation):
>     Client requests for this operation must include
>     exactly one SPARQL query string (parameter name:query)
> So when no query is specified, the server should give an error
> (which, *if* RFC2616 is followed, should be 400, not 5xx).
>
> However, any non-SPARQL server is free to respond with any status code
> when an empty "query" parameter is appended to any of its URLs.
> For example, nothing in the TPF spec stops a server at
>     http://example.org/fragments
> from giving a 404 error if a user tries
>     http://example.org/fragments?query=
> because that behavior is (purposely) unspecified.
>
> So finding out whether something is a SPARQL endpoint
> with 100% certainty is not possible with the current SPARQL 1.1 spec.
>
> Hope this helps, don't hesitate to ask more!
>
> Best,
>
> Ruben
>
>

Received on Friday, 21 August 2015 22:07:10 UTC