- From: Miguel <miguel.ceriani@gmail.com>
- Date: Fri, 21 Aug 2015 22:06:32 +0000
- To: Ruben Verborgh <ruben.verborgh@ugent.be>, Maxim Kolchin <kolchinmax@gmail.com>
- Cc: public-linked-data-fragments@w3.org, "semiot-project@googlegroups.com" <semiot-project@googlegroups.com>
- Message-ID: <CALWU=RvO2iudrw-E8B4EhQ7Yg1j9eYqKg+MjD8PBn2S-2=p6eQ@mail.gmail.com>
Hi Maxim and Ruben,

for completeness, I would add that SPARQL endpoints MAY also describe themselves in RDF using the "SPARQL 1.1 Service Description Vocabulary" [1]. Some endpoints (16.64%, according to [2]) do so, for example DBpedia [3]. Nevertheless, many endpoints do not, so at the moment you could use it only to identify some "well-behaved" SPARQL endpoints.

By the way, I guess some other "brute force" methods can be used to test whether a URL points to a SPARQL endpoint, like sending a very simple SPARQL 1.0 query.

Best,
Miguel

[1] http://www.w3.org/TR/sparql11-service-description/
[2] http://sparqles.ai.wu.ac.at/discoverability
[3] http://live.dbpedia.org/sparql

On Fri, 21 Aug 2015 at 16:18, Ruben Verborgh <ruben.verborgh@ugent.be> wrote:

> Hi Maxim,
>
> Glad you're asking about this!
>
> > Is there a way (more reliable than the one described below) to
> > distinguish a SPARQL endpoint from an LDF server based on the URL or
> > on a response to a request sent to this URL?
>
> First of all, a little nitpick:
> you probably want to distinguish SPARQL endpoints from TPF servers.
> With "LDF", we mean all servers that publish Linked Data in some way
> (so this also includes SPARQL endpoints).
> A "TPF" (Triple Pattern Fragments) server is a specific kind of LDF server
> that offers access to triples by triple pattern.
>
> The reliable way to detect a TPF interface is to look inside the
> response.
> The TPF interface is self-describing; it literally tells clients what it
> does.
> For example, take the resource with URL http://bit.ly/1I0eNgt
> (I purposely used a URL shortener here so the URL itself gives nothing away).
> If you get an RDF-based representation
>     curl -L -H "Accept: text/turtle" http://bit.ly/1I0eNgt
> it will contain the following triples (reformatted for readability):
>
>     <http://fragments.dbpedia.org/2015/en#dataset> a void:Dataset, hydra:Collection;
>       void:subset <>;
>       hydra:search [
>         hydra:template "http://fragments.dbpedia.org/2015/en{?subject,predicate,object}";
>         hydra:mapping [
>           hydra:variable "subject";
>           hydra:property rdf:subject
>         ],[
>           hydra:variable "predicate";
>           hydra:property rdf:predicate
>         ],[
>           hydra:variable "object";
>           hydra:property rdf:object
>         ]
>       ].
>
> Or, in human language:
> "This resource is a subset of the DBpedia 2015 dataset.
> You can search it by RDF subject, predicate, and object."
> In other words: "this server supports the TPF interface".
>
> A SPARQL endpoint would not tell you any of this,
> because its interface is not self-describing.
>
> Summarizing: if a server replies with the above, it supports the TPF
> interface.
> If responses do not contain this, it is certainly not a TPF interface.
> It might be a SPARQL endpoint, it might be something else.
>
> > One heuristic which could help is the status code of the response to a
> > request with an empty query parameter. If the server responds with a 5xx
> > or 4xx code, then it's a SPARQL endpoint, because it expects a non-empty
> > query parameter.
>
> So what we're discussing here is how to test whether something is a SPARQL
> endpoint.
> According to the SPARQL 1.1 Protocol
> (http://www.w3.org/TR/sparql11-protocol/#query-operation):
>     Client requests for this operation must include
>     exactly one SPARQL query string (parameter name: query)
> So when no query is specified, the server should give an error
> (which, *if* RFC2616 is followed, should be 400, not 5xx).
>
> However, any non-SPARQL server is free to respond with any status code
> when an empty "query" parameter is appended to any of its URLs.
> For example, nothing in the TPF spec stops a server at
>     http://example.org/fragments
> from giving a 404 error if a user tries
>     http://example.org/fragments?query=
> because that behavior is (purposely) unspecified.
>
> So finding out whether something is a SPARQL endpoint
> with 100% certainty is not possible with the current SPARQL 1.1 spec.
>
> Hope this helps, don't hesitate to ask more!
>
> Best,
>
> Ruben
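
To make the two checks discussed in this thread concrete, here is a rough Python sketch. It only looks at the text of a Turtle response: a Hydra search form over rdf:subject, rdf:predicate and rdf:object is taken as a sign of a TPF interface, and a mention of the SPARQL 1.1 Service Description namespace as a sign of a self-describing SPARQL endpoint. The function names (classify, fetch_turtle, and so on) are made up for illustration, and plain string matching is an assumption made to keep the example short; a real client would parse the RDF properly.

```python
import urllib.request

# Namespace IRIs of the vocabularies mentioned in this thread.
HYDRA = "http://www.w3.org/ns/hydra/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SD = "http://www.w3.org/ns/sparql-service-description#"


def _mentions(body, prefixed, iri):
    """True if the body uses a term, either as a prefixed name or as a full IRI."""
    return prefixed in body or iri in body


def looks_like_tpf(body):
    """Crude text check: a Hydra search form mapped to subject, predicate, object."""
    if not _mentions(body, "hydra:search", HYDRA + "search"):
        return False
    return all(
        _mentions(body, "rdf:" + term, RDF + term)
        for term in ("subject", "predicate", "object")
    )


def looks_like_sparql_description(body):
    """Crude text check: the SPARQL 1.1 Service Description namespace appears."""
    return SD in body


def classify(body):
    """Return 'tpf', 'sparql' or 'unknown' for the text of an RDF response."""
    if looks_like_tpf(body):
        return "tpf"
    if looks_like_sparql_description(body):
        return "sparql"
    # No self-description found: this does NOT rule out a SPARQL endpoint.
    return "unknown"


def fetch_turtle(url, timeout=10):
    """GET the URL asking for Turtle; urllib follows redirects like `curl -L`."""
    request = urllib.request.Request(url, headers={"Accept": "text/turtle"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

One would call it as classify(fetch_turtle("http://fragments.dbpedia.org/2015/en")). An "unknown" result only means that no self-description was found; since most SPARQL endpoints do not publish a service description, a further probe such as a trivial query would still be needed for those.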
Received on Friday, 21 August 2015 22:07:10 UTC