W3C home > Mailing lists > Public > public-lod@w3.org > March 2010

Re: Improving Organization of Govt. based Linked Data Projects

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Sun, 21 Mar 2010 13:32:23 -0400
Message-ID: <4BA65827.1070604@openlinksw.com>
To: Dan Brickley <danbri@danbri.org>
CC: Hugh Glaser <hg@ecs.soton.ac.uk>, Linked Data community <public-lod@w3.org>
Dan Brickley wrote:
> On 21 Mar 2010, at 12:47, Hugh Glaser <hg@ecs.soton.ac.uk> wrote:
>> Hi Kingsley, I am right with you - finding stuff is hard.
>> But I do think we could make it easier for all of us.
>> Just the esw wiki alone requires me to put every set I create into a 
>> bunch of places
> 10 years ago, looking for RDF on the public Web was like looking for a 
> needle in a haystack. There wasn't much out there and it was poorly 
> linked. So a big part of the thinking that led to the foaf/rdfweb 
> design was to make discovery easier: if you find one rdf doc, you 
> should be able to find most of the rest by following seeAlso and other 
> kinds of links.
> Why isn't this enough? 
Because it doesn't lead me explicitly to:

1. RDF Data Set Archives
2. SPARQL endpoints.

It will get me to Linked Data based hypermedia resources, but that's one 
of three distinct items.
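For illustration, the seeAlso-based discovery Dan describes looks roughly like this in Turtle (document URLs hypothetical): following rdfs:seeAlso from one FOAF document leads you to the next, but nothing in the link itself says whether the target is a hypermedia document, a dump archive, or an endpoint:

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# One FOAF document pointing a crawler at the next (hypothetical URLs).
<http://example.org/people/alice/foaf.rdf>
    rdfs:seeAlso <http://example.org/people/bob/foaf.rdf> .

# The link type alone does not distinguish a data-set archive or a
# SPARQL endpoint from just another Linked Data document.
```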

The bigger problem is really this (and it was the basis of the best 
practices): what do SPARQL endpoint publishers do when people hammer 
their endpoints with SPARQL CONSTRUCTs en route to constructing dumps 
that are then loaded into personal or service-specific endpoints? Today, 
even with DBpedia outlining the components with absolute clarity, we 
still get loads of visitors attempting to empty the Quad Store via the 
SPARQL endpoint (even though they could simply load the data sets 
themselves into an RDF store of their choice).
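For illustration, "emptying the Quad Store" typically means issuing a blanket CONSTRUCT like the following against the public endpoint, pulling every triple over HTTP instead of fetching the published dumps:

```sparql
# Copies the entire store, triple by triple -- exactly the load
# that the published data-set archives are meant to spare the endpoint.
CONSTRUCT { ?s ?p ?o }
WHERE     { ?s ?p ?o }
```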
> Perhaps because many of the datasets are huge db exports, crawlers are 
> often overwhelmed and disappear into depth-first holes? Or because we 
> don't publish triples about doc- and dataset-types in a 
> crawler-discoverable way?
I don't see this as a "Crawler Only" zone or solution. A project should 
make the following crystal clear, ultimately for its own good:

1. RDF Data Set Archives
2. SPARQL Endpoints
3. URI pattern examples for published Linked Data.

A single HTML+RDFa (or HTML5 with RDFa or Microformats) document can 
express all of the above clearly to visiting user agents.
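A minimal sketch of such a document, using the VoID vocabulary in RDFa (all URLs hypothetical), covering the three items above:

```html
<!-- A project home page describing its data offerings in RDFa,
     using the VoID vocabulary. All URLs are hypothetical. -->
<div xmlns:void="http://rdfs.org/ns/void#"
     about="#dataset" typeof="void:Dataset">
  <!-- 1. RDF data set archive (dump) -->
  <a rel="void:dataDump"
     href="http://example.org/dumps/data.nt.gz">Download the full dump</a>
  <!-- 2. SPARQL endpoint -->
  <a rel="void:sparqlEndpoint"
     href="http://example.org/sparql">Query via SPARQL</a>
  <!-- 3. Example URI for the published Linked Data -->
  <a rel="void:exampleResource"
     href="http://example.org/resource/Thing-1">Example resource</a>
</div>
```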

> A wiki page is ok for initial bootstrap but we ought to outgrow that 
> soon...
Yes, of course; hence items 1-3 above. But for now, it's better than 
what's currently in play, i.e., inconsistency.

> Dan



Kingsley Idehen	      
President & CEO 
OpenLink Software     
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen 
Received on Sunday, 21 March 2010 17:32:58 UTC
