Re: How to publish SPARQL endpoint limits/metadata? from Kingsley Idehen on 2013-10-14 (public-lod@w3.org from October 2013)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Mon, 14 Oct 2013 14:23:48 -0400
To: public-lod@w3.org
Message-ID: <525C36B4.4020308@openlinksw.com>
On 10/14/13 10:03 AM, Frans Knibbe | Geodan wrote:
> On 9-10-2013 14:07, Hugh Glaser wrote:
>> On 9 Oct 2013, at 12:46, Barry Norton <barrynorton@gmail.com> wrote:
>>
>>> On Wed, Oct 9, 2013 at 12:15 PM, Hugh Glaser<hg@ecs.soton.ac.uk>  wrote:
>>> [...]
>>>
>>> So having a separation between SPARQL Service Description and voiD would just be plain wrong.
>>> They must embrace each other, so that consumers can easily work out how to use what they think of as a "dataset".
>>>
>>> I would also add that if I take a REST-like view of the world, which I do for accessing a SPARQL endpoint (I am simply retrieving a document), the distinction between dataset and service becomes very blurred.
>>> Even calling it a "SPARQL Service Description" seems rather old-fashioned to me.
>>>
>>> Hugh, I tend to agree (certainly about calling them 'service descriptions', ugh). From a REST point of view, void:Datasets, named graphs (capable of RESTful interaction via Graph Store Protocol) and SPARQL query/update 'endpoints' (ugh again) are all resources that allow one to find other, more specific, resources.
>>>
>>> That said, if we accept that one needs some up-front guidance on what those resources allow you to get to (a big 'if' in the REST community, but I don't think anyone in ours would be happy with just a media type) then we want them to be self-describing in RDF.
>> Always & everything!
>>> At the same time, the relationships we want to attach to the query/update endpoints are semi-distinct, no? You'd agree these are different classes of resource?
>> Yes, or perhaps I am saying different sub-classes?
>
> That is an interesting perspective. It is probably because I come from 
> the world of relational databases that I see a dataset as a collection 
> of data and the endpoint as a means of getting those data. In my 
> setup, I have made a single SPARQL endpoint that serves many datasets. 
> I hope there is nothing wrong with that approach... I can also imagine 
> a single dataset having multiple SPARQL endpoints. For example, a free 
> public endpoint and an endpoint for paying users with a higher service 
> level.

Yes of course.

Maybe it's just better to think of "data" instead of "dataset", i.e., you 
have data accessible via a variety of data access services.
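
A minimal sketch of that view, assuming made-up dataset and endpoint URIs 
(none of them come from this thread): data and data access services are 
kept apart and related many-to-many, so one endpoint can serve several 
datasets and one dataset can sit behind several endpoints.

```python
# Hypothetical (access service, data) pairs; every URI here is invented
# purely for illustration.
ACCESS = [
    ("http://example.org/sparql", "http://example.org/data/roads"),
    ("http://example.org/sparql", "http://example.org/data/buildings"),
    ("http://example.org/sparql-premium", "http://example.org/data/roads"),
]


def services_for(data_uri, access=ACCESS):
    """Return the access services that expose the given data, sorted."""
    return sorted({svc for svc, data in access if data == data_uri})


def data_behind(service_uri, access=ACCESS):
    """Return the data reachable through the given access service, sorted."""
    return sorted({data for svc, data in access if svc == service_uri})


if __name__ == "__main__":
    # One dataset, two endpoints (e.g. a free one and a paid one).
    print(services_for("http://example.org/data/roads"))
    # One endpoint, many datasets.
    print(data_behind("http://example.org/sparql"))
```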

>
> If datasets and endpoints would be different subclasses of the same 
> parent class, what would be properties of the parent class that the 
> two could share?

Datasets and endpoints are disjoint.

>
> As an aside,  I now notice a mistake in the subject header of the 
> thread if I maintain my view on the distinction between datasets and 
> endpoints: If a SPARQL endpoint is not a data collection, it can not 
> have metadata (data about data) :-)

It can have a description of the services it provides. It can also be 
associated with data describing the data for which it provides data 
access services.
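
As a rough sketch of what such a pairing could look like, the snippet 
below writes a small Turtle document: the endpoint is typed as an 
sd:Service (SPARQL Service Description), and each dataset is typed as a 
void:Dataset that points back at the endpoint via void:sparqlEndpoint. 
The function name and the example URIs are assumptions for illustration; 
only the sd: and void: terms come from the published vocabularies.

```python
SD = "http://www.w3.org/ns/sparql-service-description#"
VOID = "http://rdfs.org/ns/void#"


def describe(endpoint_uri, dataset_uris):
    """Build a Turtle string for one endpoint and the datasets it serves.

    Hypothetical helper: it does plain string formatting and does not
    validate or escape the URIs it is given.
    """
    lines = [
        f"@prefix sd: <{SD}> .",
        f"@prefix void: <{VOID}> .",
        "",
        # The endpoint is described as a service, not as data.
        "[] a sd:Service ;",
        f"    sd:endpoint <{endpoint_uri}> .",
    ]
    for ds in dataset_uris:
        # Each dataset is described separately and linked to the endpoint.
        lines += [
            "",
            f"<{ds}> a void:Dataset ;",
            f"    void:sparqlEndpoint <{endpoint_uri}> .",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(describe("http://example.org/sparql",
                   ["http://example.org/data/roads",
                    "http://example.org/data/buildings"]))
```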

>
>> Thinking of it that way, I then look at Frans' list of the kind of thing he would like to be able to say about endpoints.
>> It seems that at least the following might be common to almost any delivery mechanism for datasets:
>>
>> 	• The time period of the next scheduled downtime
>> 	• (the URI of) a document that contains a human readable SLA or fair use policy for the service
>> 	• URIs of mirrors
>
> Yes, I thought about that too. These are properties that ideally 
> should be communicated at a higher conceptual level. I briefly looked 
> at HTTP headers but I could not find anything there. Perhaps there is 
> a need to have a lightweight vocabulary for describing any data access 
> API? 

This is basically the task at hand, i.e., add some descriptions oriented 
towards the data access services, especially as Linked Data is all about Data 
Connectivity as opposed to Database Connectivity.

> Or would that be overcomplicating things? The Linked Data API 
> vocabulary 
> <http://linked-data-api.googlecode.com/svn/trunk/vocab/api.ttl#> does 
> not seem entirely appropriate, because it seems to exclude SPARQL 
> endpoints.

As indicated above, we need to cater to the needs of data access services 
(what the endpoint offers) and actual data (the collection of statements 
partitioned using named graph IRIs).
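
To make the "partitioned using named graph IRIs" part concrete, here is a 
small sketch that groups statements by the graph they belong to. The 
quads and the function name are invented for illustration; a real store 
would do this internally.

```python
from collections import defaultdict


def partition_by_graph(quads):
    """Group (subject, predicate, object, graph) tuples by graph IRI."""
    graphs = defaultdict(list)
    for s, p, o, g in quads:
        graphs[g].append((s, p, o))
    return dict(graphs)


# Hypothetical statements; every IRI below is made up.
QUADS = [
    ("ex:road1", "ex:name", "A1", "http://example.org/graph/roads"),
    ("ex:road2", "ex:name", "A2", "http://example.org/graph/roads"),
    ("ex:bldg1", "ex:height", "12", "http://example.org/graph/buildings"),
]

if __name__ == "__main__":
    for graph_iri, triples in partition_by_graph(QUADS).items():
        print(graph_iri, len(triples))
```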

Links:

1. http://bit.ly/1biJTBZ -- Illustrating Data Access Pathways (note: 
ODBC, JDBC, ADO.NET etc. have long established metadata patterns for 
data access services).

>
> Regards,
> Frans
>
>


-- 

Regards,

Kingsley Idehen	
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Received on Monday, 14 October 2013 18:24:07 UTC