Re: Discovering a query endpoint associated with a given Linked Data resource

Hi Nandana,

No, we haven't investigated that further. As for the "why", it is hard to 
examine at scale (you could of course ask all data providers, but...).

For the non-discoverable VoIDs, there is also a methodological problem - 
how would we know that they exist if we are not able to discover them?
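
For reference, "discoverable" here means one of the two routes from the VoID
spec: a description at /.well-known/void on the host, or a void:inDataset
back-link from the resource itself. A minimal Python sketch of the first
route, under the assumption that only the candidate URL is needed (the helper
name well_known_void_url is made up for illustration, and no request is sent):

```python
from urllib.parse import urlsplit, urlunsplit


def well_known_void_url(resource_uri: str) -> str:
    """Return the /.well-known/void URL on the host that serves resource_uri."""
    parts = urlsplit(resource_uri)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URI: {resource_uri!r}")
    # Keep scheme and authority, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/.well-known/void", "", ""))


if __name__ == "__main__":
    print(well_known_void_url("http://dbpedia.org/resource/Sri_Lanka"))
```

A description that lives anywhere else on the host and has no back-link
is invisible to a client that only tries this URL.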

Best,
Heiko



On 26.08.2015 at 12:40, Nandana Mihindukulasooriya wrote:
> Hi Heiko,
>
> Thanks a lot for the pointer to the paper.
>
> In your experiment, were you able to get some insights on *why* data 
> publishers are not providing VoID descriptions when it is applicable 
> to do so (leaving out single FOAF documents etc.) ?
>
> [[Approaches using proposed methods such as VoID and the provenance 
> vocabulary are scarcely in use (and sometimes not implemented 
> according to the specification), they lead to a valid SPARQL endpoint 
> in less than 1% of all cases.]]
>
> Also, did you find many cases where a VoID description is actually 
> available but not discoverable according to the VoID spec (such as 
> the case you mention of the description not being at the root level 
> but at another level)? For instance, 
> http://dbpedia.org/void/Dataset exists but is not in 
> http://dbpedia.org/.well-known/void and the resources don't provide a 
> back-link.
>
> Best Regards,
> Nandana
>
>
> On Wed, Aug 26, 2015 at 12:05 PM, Heiko Paulheim 
> <heiko@informatik.uni-mannheim.de> wrote:
>
>     Hi all,
>
>     two years ago, we conducted an empirical experiment to find out
>     how promising the different approaches to discover SPARQL
>     endpoints are. The results were rather disappointing, see [1].
>
>     Executive summary: rather than trying to find VoID descriptions
>     (which rarely exist), querying catalogues like datahub seems more
>     promising (higher recall at least, precision is lower).
>
>     Hth.
>
>     Best,
>     Heiko
>
>     [1] http://www.heikopaulheim.com/docs/iswc2013_poster.pdf
>
>
>
>
>     On 26.08.2015 at 11:50, Nandana Mihindukulasooriya wrote:
>>     Thanks all for the pointers.
>>
>>     Yes, it seems it is quite rare in practice. I tried several hosts
>>     that provide Linked Data resources but couldn't find ones that
>>     provide a VoID description in .well-known/void.
>>
>>     I guess there is a higher technical barrier in making that
>>     description available in the given location compared to providing
>>     that information in the response in most cases. So probably the
>>     pragmatic thing to do would be to include this information either
>>     in the content or as a Link relation header using the void
>>     properties when dereferenced.
>>
>>     So I can use the void:inDataset back-link mechanism [1] and point
>>     to a VoID description that will have the necessary information
>>     about the query endpoints.
>>
>>     -----
>>     @prefix void: <http://rdfs.org/ns/void#> .
>>     @prefix dbpedia: <http://dbpedia.org/resource/> .
>>
>>     dbpedia:Sri_Lanka void:inDataset _:DBpedia .
>>     _:DBpedia a void:Dataset ;
>>         void:sparqlEndpoint <http://dbpedia.org/sparql> ;
>>         void:uriLookupEndpoint <http://fragments.dbpedia.org/2014/en?subject=> .
>>     ------
>>     or
>>
>>     ----
>>     Link: <http://dbpedia.org/void/Dataset>; rel="http://rdfs.org/ns/void#inDataset"
>>     ----
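
On the client side, the Link header variant above could be consumed roughly
as follows. This is only a sketch in Python: the regular expressions cover
simple headers like the ones in this thread, not the full Web Linking
grammar, and the function names parse_link_header and targets_for_rel are
made up for illustration.

```python
import re

VOID_IN_DATASET = "http://rdfs.org/ns/void#inDataset"

# One link-value: <target> followed by ;name="value" or ;name=token parameters.
_LINK_VALUE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w*-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)'
)
_PARAM = re.compile(r';\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value):
    """Split a Link header value into (target, params) pairs."""
    links = []
    for target, raw_params in _LINK_VALUE.findall(value):
        params = {}
        for name, quoted, token in _PARAM.findall(raw_params):
            # Keep the first occurrence of a repeated parameter.
            params.setdefault(name.lower(), quoted or token)
        links.append((target, params))
    return links


def targets_for_rel(value, rel):
    """Return the targets of all links whose rel list contains rel."""
    return [
        target
        for target, params in parse_link_header(value)
        if rel in params.get("rel", "").split()
    ]


if __name__ == "__main__":
    header = (
        '<http://dbpedia.org/void/Dataset>; '
        'rel="http://rdfs.org/ns/void#inDataset"'
    )
    print(targets_for_rel(header, VOID_IN_DATASET))
```

The returned URL is only a pointer to the VoID description; the client still
has to fetch it and look for void:sparqlEndpoint or void:uriLookupEndpoint.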
>>
>>     Best Regards,
>>     Nandana
>>
>>     [1] http://www.w3.org/TR/void/#discovery-links
>>
>>     On Wed, Aug 26, 2015 at 11:05 AM, Miel Vander Sande
>>     <miel.vandersande@ugent.be> wrote:
>>
>>         Hi Nandana,
>>
>>         I guess VoID would be the best fit
>>
>>         In case of LDF you could use
>>
>>         <...> void:uriLookupEndpoint
>>         <http://fragments.dbpedia.org/2014/en?subject=>
>>
>>         But whether these exist in practice? Probably not. I'd leave
>>         it up to the publisher of the dereferenced resource to provide
>>         this triple in the response, rather than doing the .well-known
>>         thing.
>>
>>         Best,
>>
>>         Miel
>>
>>         On 26 Aug 2015, at 10:57, Víctor Rodríguez Doncel
>>         <vrodriguez@fi.upm.es> wrote:
>>
>>         >
>>         > Well, you might try to look in this folder location:
>>         > .well-known/void
>>         > And possibly find a "void:sparqlEndpoint".
>>         >
>>         > But this would be too good to be true.
>>         >
>>         > Regards,
>>         > Víctor
>>         >
>>         > On 26/08/2015 at 10:45, Nandana Mihindukulasooriya wrote:
>>         >> Hi,
>>         >>
>>         >> Is there a standard or widely used way of discovering a
>>         >> query endpoint (SPARQL/LDF) associated with a given Linked
>>         >> Data resource?
>>         >>
>>         >> I know that a client can use "follow your nose" and
>>         >> related link traversal approaches such as [1], but I
>>         >> wonder if it is possible to have a hybrid approach in which
>>         >> dereferenceable Linked Data resources optionally
>>         >> advertise query endpoint(s) in a standard way so that
>>         >> clients can perform queries on related data.
>>         >>
>>         >> To clarify the use case a bit, when a client dereferences
>>         >> a resource URI it gets a set of triples (an RDF graph) [2].
>>         >> In some cases, the returned graph may be a subgraph of a
>>         >> named graph or the default graph of an RDF dataset. The
>>         >> client wants to discover a query endpoint that exposes the
>>         >> relevant dataset, if one is available.
>>         >>
>>         >> For example, something like the following using the
>>         >> "search" link relation [3].
>>         >>
>>         >> ------
>>         >> HEAD /resource/Sri_Lanka HTTP/1.1
>>         >> Host: dbpedia.org
>>         >> ------
>>         >> HTTP/1.1 200 OK
>>         >> Link: <http://dbpedia.org/sparql>; rel="search"; type="sparql",
>>         >>   <http://fragments.dbpedia.org/2014/en#dataset>; rel="search"; type="ldf"
>>         >> ... other headers ...
>>         >> ------
>>         >>
>>         >> Best Regards,
>>         >> Nandana
>>         >>
>>         >> [1] http://swsa.semanticweb.org/sites/g/files/g524521/f/201507/DissertationOlafHartig_0.pdf
>>         >> [2] http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/#section-rdf-graph
>>         >> [3] http://www.iana.org/assignments/link-relations/link-relations.xhtml
>>         >
>>         >
>>         > --
>>         > Víctor Rodríguez-Doncel
>>         > D3205 - Ontology Engineering Group (OEG)
>>         > Departamento de Inteligencia Artificial
>>         > Facultad de Informática
>>         > Universidad Politécnica de Madrid
>>         >
>>         > Campus de Montegancedo s/n
>>         > Boadilla del Monte-28660 Madrid, Spain
>>         > Tel. (+34) 91336 3672
>>         > Skype: vroddon3
>>         >
>>         >
>>
>
>     -- 
>     Prof. Dr. Heiko Paulheim
>     Data and Web Science Group
>     University of Mannheim
>     Phone: +49 621 181 2646
>     B6, 26, Room C1.08
>     D-68159 Mannheim
>
>     Mail: heiko@informatik.uni-mannheim.de
>     Web: www.heikopaulheim.com
>
>

-- 
Prof. Dr. Heiko Paulheim
Data and Web Science Group
University of Mannheim
Phone: +49 621 181 2646
B6, 26, Room C1.08
D-68159 Mannheim

Mail: heiko@informatik.uni-mannheim.de
Web: www.heikopaulheim.com

Received on Wednesday, 26 August 2015 11:30:33 UTC