Re: Finding SPAQL endpoints? from Michael Hausenblas on 2009-03-10 (public-lod@w3.org from March 2009)

From: Michael Hausenblas <michael.hausenblas@deri.org>
Date: Tue, 10 Mar 2009 07:39:04 +0000
To: Kingsley Idehen <kidehen@openlinksw.com>, Daniel Schwabe <dschwabe@inf.puc-rio.br>, Giovanni Tummarello <giovanni.tummarello@deri.org>
CC: Linked Data community <public-lod@w3.org>
Message-ID: <C5DBC998.2A5F%michael.hausenblas@deri.org>

Daniel, Kingsley, Giovanni,

My 0.02€:

Having more than one option for discovery at hand is definitely no
drawback. I'm trying to reach out and gather reactions from a wider audience
at [1].

@Giovanni: re applications, precisely my point [2] - there needs to be an
incentive beyond being a good citizen on planet LOD, getting papers accepted
and being acknowledged by peers in the group. We need to demonstrate the
power to people and they will start to pick it up - but, looking at recent
developments, they are actually starting to do so ;)

@Kingsley: Good practice notes? Sure, as you know we have them [3] and they
should be extended and updated (re this issue, RDFa and maybe more IMHO)

Cheers,
      Michael

[1] http://webofdata.wordpress.com/2009/03/10/sparql-endpoint-discovery/
[2] http://sw-app.org/pub/exploit-lod-webapps-IEEEIC-preprint.pdf
[3] http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/

-- 
Dr. Michael Hausenblas
DERI - Digital Enterprise Research Institute
National University of Ireland, Lower Dangan,
Galway, Ireland, Europe
Tel. +353 91 495730
http://sw-app.org/about.html
http://webofdata.wordpress.com/


> From: Kingsley Idehen <kidehen@openlinksw.com>
> Date: Mon, 09 Mar 2009 20:15:00 -0400
> To: Daniel Schwabe <dschwabe@inf.puc-rio.br>
> Cc: Linked Data community <public-lod@w3.org>
> Subject: Re: Finding SPAQL endpoints?
> Resent-From: Linked Data community <public-lod@w3.org>
> Resent-Date: Tue, 10 Mar 2009 00:15:43 +0000
> 
> Daniel Schwabe wrote:
>> All,
>> the sitemap.xml solution works IF everybody (or most) has the
>> robots.txt or the sitemap.xml in the root directory. So, conceptually
>> speaking, it should be the way to go.
>> 
>> But a quick test on the LOD cloud returned 404 for many if not most
>> sites for both sitemap.xml and robots.txt...
>> Curiously, for many of those without a sitemap.xml, the
>> <c-name>/sparql URI format to access the SPARQL endpoint DOES work...
>> 
>> So something is still missing. Either each dataspace maintainer who is
>> willing to provide a SPARQL endpoint also provides a (even if
>> minimal) sitemap.xml or voiD description, or at least follows this
>> convention.
>> This would greatly enhance the accessibility of the data, and enable
>> tools to automatically find them as needed...
>> 
>> Cheers
>> D
> Daniel,
> 
> +1
> 
> Clearly we need to document the best practices somewhere :-)
> 
> 
> 
> Kingsley
>> 
>> 
>> Sergio Fernández wrote:
>>> On Sat, 2009-03-07 at 00:36 -0300, Daniel Schwabe wrote:
>>>   
>>>> I could query the site for its sitemap extension (would it always be
>>>> <home url>/sitemap.xml?
>>>>     
>>> 
>>> Yes, you can do it in a programmatic way. But that URL (/sitemap.xml),
>>> even though it's commonly used, is not mandatory, so you can't use it as
>>> a constant. There is one way, though, not so direct, but at least one
>>> that is standard:
>>> 
>>> 1) From /robots.txt you can take the Sitemap's URL ("Sitemap:" as [1]
>>> specifies)
>>> 2) According to the extension proposed by DERI [2], you can check whether
>>> the sitemap points to a SPARQL endpoint by looking for the
>>> sc:sparqlEndpointLocation element.
>>> 
>>> Hope that helps.
>>> 
>>> Best,
>>> 
>>> [1] http://www.sitemaps.org/protocol.php
>>> [2] http://sw.deri.org/2007/07/sitemapextension/
>>> 
>>>   
>> 
>> -- 
>> Daniel Schwabe
>> Tel:+55-21-3527 1500 r. 4356
>> Fax: +55-21-3527 1530
>> http://www.inf.puc-rio.br/~dschwabe  Dept. de Informatica, PUC-Rio
>> R. M. de S. Vicente, 225
>> Rio de Janeiro, RJ 22453-900, Brasil
>> 
> 
> 
> -- 
> 
> 
> Regards,
> 
> Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen
> President & CEO 
> OpenLink Software     Web: http://www.openlinksw.com
> 
> 
> 
> 
>
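
Note on the quoted thread: the two-step lookup Sergio describes (read the
"Sitemap:" line from /robots.txt, then look for sc:sparqlEndpointLocation in
the sitemap), plus the /sparql convention Daniel observed, can be sketched
roughly as below. This is only an illustrative sketch, not anyone's actual
tooling. The function names are made up for the example, the sitemap element
is matched by local name only (so no particular namespace URI is assumed),
and reading "<c-name>/sparql" as "/sparql at the site root" is an assumption.
Fetching the documents is left out; the functions work on text already
retrieved.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin


def sitemap_urls_from_robots(robots_txt):
    """Return the URLs given on 'Sitemap:' lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Split on the first colon only, so the URL's own colons survive.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def sparql_endpoints_from_sitemap(sitemap_xml):
    """Return the text of every sparqlEndpointLocation element found.

    Elements are matched by local name, ignoring the namespace, which is
    a simplification made for this sketch.
    """
    root = ET.fromstring(sitemap_xml)
    endpoints = []
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "sparqlEndpointLocation" and element.text:
            endpoints.append(element.text.strip())
    return endpoints


def conventional_endpoint(site_url):
    """Fallback guess: /sparql at the site root (assumed convention)."""
    return urljoin(site_url, "/sparql")
```

A caller would try the sitemap route first and fall back to
conventional_endpoint() only when robots.txt or the sitemap is missing or
lists no endpoint, which matches the 404 situation Daniel reports.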

Received on Tuesday, 10 March 2009 07:39:47 UTC