Re: Finding SPARQL endpoints?

Daniel Schwabe wrote:
> All,
> the sitemap.xml solution works IF everybody (or most) have the 
> robots.txt or the sitemap.xml at the root directory. So, conceptually 
> speaking, it should be the way to go.
>
> But a quick test on the LOD cloud returned 404 for many if not most 
> sites for both sitemap.xml and robots.txt...
> Curiously, for many of those without a sitemap.xml, the 
> <c-name>/sparql URI format to access the SPARQL endpoint DOES work...
>
> So something is still missing. Either each dataspace maintainer that is 
> willing to provide the SPARQL endpoint also provides a (even if 
> minimal) sitemap.xml or voiD description, or at least follows this 
> convention.
> This would greatly enhance the accessibility of the data, and enable 
> tools to automatically find them as needed...
>
> Cheers
> D
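
The three well-known locations Daniel mentions can be derived from any page URL on a site. A minimal sketch in Python, assuming that <c-name> means the site's host name; the function name conventional_locations is made up for illustration, and none of these paths is guaranteed to exist on a given site:

```python
from urllib.parse import urlsplit, urlunsplit


def conventional_locations(site_url):
    """Build the root-level URLs a discovery tool could probe for a site.

    Only the scheme and host of site_url are kept; path, query and
    fragment are discarded because all three conventions live at the root.
    """
    parts = urlsplit(site_url)
    root = (parts.scheme, parts.netloc)
    return {
        # Standard location of robots.txt, which may carry "Sitemap:" lines.
        "robots": urlunsplit(root + ("/robots.txt", "", "")),
        # Common, but not mandatory, location of the sitemap itself.
        "sitemap": urlunsplit(root + ("/sitemap.xml", "", "")),
        # The informal <c-name>/sparql convention mentioned above.
        "sparql": urlunsplit(root + ("/sparql", "", "")),
    }
```

A tool would still have to request each URL and handle a 404, since the quick test above found many sites serving neither file.
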
Daniel,

+1

Clearly we need to document the best practices somewhere :-)



Kingsley
>
>
> Sergio Fernández wrote:
>> On Sat, 2009-03-07 at 00:36 -0300, Daniel Schwabe wrote:
>>   
>>> I could query the site for its sitemap extension (would it always be 
>>> <home url>/sitemap.xml?)
>>>     
>>
>> Yes, you can do it programmatically. But that URL (/sitemap.xml),
>> although commonly used, is not mandatory, so you can't treat it as a
>> constant. There is another way, less direct, but at least
>> standard:
>>
>> 1) From /robots.txt you can take the Sitemap's URL ("Sitemap:" as [1]
>> specifies)
>> 2) According to the extension proposed by DERI [2], you can check
>> whether the sitemap points to a SPARQL endpoint by looking for the
>> sc:sparqlEndpointLocation element.
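
The two steps above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a reference implementation: the function names are invented here, fetch is a hypothetical caller-supplied function that returns the body of a URL as text, and the element is matched by its local name only, so the sketch does not depend on the exact namespace URI of the DERI extension.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin


def sitemap_urls_from_robots(robots_txt):
    """Step 1: collect the URLs given on 'Sitemap:' lines of robots.txt."""
    urls = []
    for line in robots_txt.splitlines():
        # Split "field: value" on the first colon only, so the colon
        # inside "http://" stays part of the value.
        field, sep, value = line.strip().partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


def sparql_endpoints_from_sitemap(sitemap_xml):
    """Step 2: collect the text of every sparqlEndpointLocation element."""
    root = ET.fromstring(sitemap_xml)
    endpoints = []
    for elem in root.iter():
        # ElementTree writes namespaced tags as "{namespace-uri}local";
        # keep only the local part for the comparison.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "sparqlEndpointLocation" and elem.text and elem.text.strip():
            endpoints.append(elem.text.strip())
    return endpoints


def discover_endpoints(base_url, fetch):
    """Chain both steps; fetch(url) -> str is supplied by the caller (assumption)."""
    robots_txt = fetch(urljoin(base_url, "/robots.txt"))
    endpoints = []
    for sitemap_url in sitemap_urls_from_robots(robots_txt):
        endpoints.extend(sparql_endpoints_from_sitemap(fetch(sitemap_url)))
    return endpoints
```

Passing fetch in as a parameter keeps the parsing testable offline; a real tool would also need to handle missing files, redirects and malformed XML.
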
>>
>> Hope that helps.
>>
>> Best,
>>
>> [1] http://www.sitemaps.org/protocol.php
>> [2] http://sw.deri.org/2007/07/sitemapextension/
>>
>>   
>
> -- 
> Daniel Schwabe
> Tel:+55-21-3527 1500 r. 4356
> Fax: +55-21-3527 1530
> http://www.inf.puc-rio.br/~dschwabe  Dept. de Informatica, PUC-Rio
> R. M. de S. Vicente, 225
> Rio de Janeiro, RJ 22453-900, Brasil
>


-- 


Regards,

Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com

Received on Tuesday, 10 March 2009 00:15:41 UTC