Re: service description vocabulary

On 28 Sep 2009, at 17:06, Gregory Williams wrote:

> On Sep 26, 2009, at 4:36 AM, Steve Harris wrote:
>
>>> So, in a quad store, you will describe each graph separately using  
>>> voiD?
>>> Won't that be too much information in the SD? E.g. if I have 1  
>>> million RDF files in my store, I will have 1 million voiD  
>>> descriptions in the SD.
>>
>> Right, the FOAF store that backs http://foaf.qdos.com/ for example  
>> has around 2 million graphs in it, and we would obviously like to  
>> be able to describe its contents in a standard manner.
>
> We've talked about having both a way to embed the dataset  
> description in the SD document and also a way to link to a URL where  
> the dataset description can be retrieved. Would that satisfy your  
> needs here?

Yes, I think so.

>> Also, at the risk of sounding like a broken record :) it's critical  
>> that there be a way to talk about the endpoint using relative URIs  
>> or some similar trick so that the software emitting the  
>> descriptions is not required to know the endpoint URI that the  
>> client is connected to.
>>
>> Descriptions using <> as a subject are fine, but other things will  
>> be problematic.
>
> I'm not sure what it is that you're referring to as I thought  
> everything that's been proposed regarding the vocabulary so far is  
> in line with your relative URI requirement.

I was worried by the presence of "<endpoint>".
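
To make that concrete, here's a rough sketch of the kind of thing I'd be
happy with, using the linked variant discussed below. The sd: namespace
and the example.org URIs are placeholders made up for illustration, not
anything that has been agreed:

```turtle
# Placeholder namespace (an assumption; the real sd: namespace is not fixed here).
@prefix sd: <http://example.org/sd#> .

# <> is a relative URI: it resolves against the URL the client fetched
# this description from, so the emitting software never names the endpoint.
<> sd:namedGraph [
    sd:graphName <http://example.org/graphs/alice> ;                   # hypothetical graph
    sd:graphDescriptionURL <http://example.org/descriptions/alice>     # hypothetical URL
] .
```

Because <> resolves against whatever URL the client actually retrieved,
the server doesn't need to know its own public endpoint URI.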

>>> So
>>>
>>> <endpoint> sd:datasetDescription [
>>> 	sd:defaultGraph [
>>> 		sd:graphName <graph-name> ;
>>> 		sd:graphDescription <void-dataset-for-default-graph> ;
>>> 	] ;
>>> 	sd:namedGraph [
>>> 		sd:graphName <graph-name> ;
>>> 		sd:graphDescription <void-dataset-for-named-graph> ;
>>> 	] ;
>>> ] .
>>
>> Let's not bake Void into SPARQL. It's sufficient to say that it  
>> should be an RDF description, the exact vocabulary can be left open.
>
> Yeah, agreed. The idea was never to bake voiD into the spec, but to  
> think of it as a possible best practice to use for describing the  
> datasets (and it makes examples easier to discuss since we've  
> already got a vocabulary that people know).

OK, so let's use something more neutral than  
<void-dataset-for-named-graph> in examples.

>>> while I'd think a simple way would be
>>>
>>> <endpoint> sd:datasetDescription <void-dataset-for-dataset> ;
>>> 	sd:defaultGraph <graph-name> ;
>>> 	sd:namedGraph <graph-name> .
>>
>> That would be more palatable, but if it's possible for the client  
>> to request a description there needs to be some way for the client  
>> to request the description of a specific graph.
>
> How about the two variants (embedded and linked) like so:
>
> sd:namedGraph [
> 	sd:graphName <graph-name> ;
> 	sd:graphDescription <void-dataset-for-named-graph> ;
> ] .
>
> (this is the example from before), and:
>
> sd:namedGraph [
> 	sd:graphName <graph-name> ;
> 	sd:graphDescriptionURL <document-url-with-void-dataset-for-named-graph> ;
> ] .
>
> It's not quite as simple as just "<> sd:namedGraph <graph-name>",  
> but it keeps the actual dataset description out of service  
> description document while giving you a place to go get it.

It does mean that the description of foaf.qdos.com as above will be  
around 4M triples if we go this route. I'm comfortable with that,  
though I suspect that in reality there's no point in individually  
describing 2M FOAF graphs: you might as well just fetch the data.
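
As a back-of-envelope check on that figure, here is one entry of the
linked variant with the triples marked. The namespace and URIs are again
made-up placeholders, and the count assumes one such entry per graph:

```turtle
# Placeholder namespace (an assumption).
@prefix sd: <http://example.org/sd#> .

<> sd:namedGraph [                                               # link triple (not in the 4M count)
    sd:graphName <http://example.org/graphs/1> ;                 # triple 1
    sd:graphDescriptionURL <http://example.org/descriptions/1>   # triple 2
] .

# 2 triples per entry x ~2M graphs = ~4M triples.
# Counting the sd:namedGraph link as well gives 3 x ~2M = ~6M.
```

Either way it's the same order of magnitude, so the conclusion doesn't
change.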

- Steve

-- 
Steve Harris
Garlik Limited, 2 Sheen Road, Richmond, TW9 1AE, UK
+44(0)20 8973 2465  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10  
9AD

Received on Tuesday, 29 September 2009 07:50:07 UTC