- From: Ruben Verborgh <ruben.verborgh@ugent.be>
- Date: Fri, 21 Aug 2015 22:17:15 +0200
- To: Maxim Kolchin <kolchinmax@gmail.com>
- Cc: public-linked-data-fragments@w3.org, "semiot-project@googlegroups.com" <semiot-project@googlegroups.com>
Hi Maxim,

Glad you're asking about this!

> Is there a way (more reliable than the one described below) to
> distinguish SPARQL endpoint and LDF server based on URL or an response
> to a request sent to this URL?

First of all, a little nitpick: you probably want to distinguish SPARQL endpoints from TPF servers. With "LDF", we mean all servers that publish Linked Data in some way (so this also includes SPARQL endpoints). A "TPF" (Triple Pattern Fragments) server is a specific kind of LDF server that offers access to triples by triple pattern.

The reliable way to detect a TPF interface is to look inside the response. The TPF interface is self-describing; it literally tells clients what it does.

For example, take the resource with URL http://bit.ly/1I0eNgt (I purposely used a URL shortener here so we can't see what is behind it). If you get an RDF-based representation

    curl -L -H "Accept: text/turtle" http://bit.ly/1I0eNgt

it will contain the following triples (reformatted for readability):

    <http://fragments.dbpedia.org/2015/en#dataset>
        a void:Dataset, hydra:Collection;
        void:subset <>;
        hydra:search [
            hydra:template "http://fragments.dbpedia.org/2015/en{?subject,predicate,object}";
            hydra:mapping
                [ hydra:variable "subject";   hydra:property rdf:subject ],
                [ hydra:variable "predicate"; hydra:property rdf:predicate ],
                [ hydra:variable "object";    hydra:property rdf:object ]
        ].

Or, in human language: "This resource is a subset of the DBpedia 2015 dataset. You can search it by RDF subject, predicate, and object." In other words: "this server supports the TPF interface".

A SPARQL endpoint would not tell you any of this, because its interface is not self-describing.

Summarizing: if a server replies with the above, it supports the TPF interface. If responses do not contain this, it is certainly not a TPF interface. It might be a SPARQL endpoint, or it might be something else.

> One heuristic which could help is the status code of the response to a
> request with empty query parameter. If the server responded with 5xx
> or 4xx code then it's a SPARQL endpoint, because it expects non-empty
> query parameter.

So what we're discussing here is how to test whether something is a SPARQL endpoint. According to the SPARQL 1.1 Protocol (http://www.w3.org/TR/sparql11-protocol/#query-operation):

    Client requests for this operation must include exactly one
    SPARQL query string (parameter name: query)

So when no query is specified, the server should give an error (which, *if* RFC 2616 is followed, should be 400, not 5xx).

However, any non-SPARQL server is free to respond with any status code when an empty "query" parameter is appended to any of its URLs. For example, nothing in the TPF spec stops a server at http://example.org/fragments from giving a 404 error if a user tries http://example.org/fragments?query= because that behavior is (purposely) unspecified.

So finding out whether something is a SPARQL endpoint with 100% certainty is not possible with the current SPARQL 1.1 spec.

Hope this helps, don't hesitate to ask more!

Best,

Ruben
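
As a rough illustration of the two checks discussed above, here is a minimal Python sketch. The function names (looks_like_tpf, empty_query_hint) are hypothetical, and matching on the Turtle text with regular expressions is an assumption made for brevity: it only works when the response uses the hydra: and rdf: prefixes as in the example, and a real client would parse the RDF and inspect the triples instead.

```python
import re

# The three properties a TPF search form maps its variables to.
TPF_PROPERTIES = ("rdf:subject", "rdf:predicate", "rdf:object")


def looks_like_tpf(turtle_text: str) -> bool:
    """Return True if the Turtle text appears to describe a TPF search form.

    Assumption: prefixed names (hydra:, rdf:) appear literally in the text.
    """
    # A URI template with a query expansion, e.g. "...{?subject,predicate,object}"
    has_template = re.search(
        r'hydra:template\s+"[^"]*\{\?[^}"]+\}"', turtle_text
    )
    # One hydra:property mapping for each of subject, predicate, and object.
    has_mappings = all(
        re.search(r"hydra:property\s+" + re.escape(prop) + r"(?![\w-])", turtle_text)
        for prop in TPF_PROPERTIES
    )
    return bool(has_template) and has_mappings


def empty_query_hint(status: int) -> str:
    """Interpret the status code of a request sent with an empty query parameter.

    The result is never conclusive: a non-SPARQL server may answer an
    empty "query" parameter with any status code.
    """
    if 400 <= status < 600:
        return "error response: consistent with a SPARQL endpoint, but not proof"
    return "no error: inconclusive"
```

Here looks_like_tpf would be called on the body returned by the curl command above. Note that empty_query_hint never returns a definite answer, which reflects the conclusion that the status code alone cannot identify a SPARQL endpoint.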
Received on Friday, 21 August 2015 20:17:47 UTC