Re: service description vocabulary from Steve Harris on 2009-09-26 (public-rdf-dawg@w3.org from July to September 2009)

From: Steve Harris <steve.harris@garlik.com>
Date: Sat, 26 Sep 2009 09:36:26 +0100
To: "public-rdf-dawg@w3.org Group" <public-rdf-dawg@w3.org>
Message-Id: <E34F20F1-4883-4818-B040-A0983722AF52@garlik.com>

I may be out of sync on this discussion, as I haven't received mail  
recently, and the archive still appears to be partial, so apologies in  
advance if I've misunderstood.

On 25 Sep 2009, at 08:38, Alexandre Passant wrote:
> Hi,
>
> On 25 Sep 2009, at 03:41, Gregory Williams wrote:
>
>> Beyond what's currently listed in the vocab section of the service  
>> description page[1], I think we need a way to describe the dataset  
>> provided by the endpoint. This goes beyond what things like voiD  
>> provide which is a way to describe a single graph. Therefore, I'd  
>> like to suggest something like this:
>>
>> <endpoint> sd:datasetDescription [
>> 	sd:defaultGraph <void-dataset-for-default-graph> ;
>> 	sd:namedGraph [
>> 		sd:graphName <graph-name> ;
>> 		sd:graphDescription <void-dataset-for-named-graph> ;
>> 	]
>> ] .
>
> So, in a quad store, you would describe each graph separately using
> voiD?
> Won't that be too much information in the SD? E.g. if I have 1 million
> RDF files in my store, I will have 1 million voiD descriptions in
> the SD.

Right, the FOAF store that backs http://foaf.qdos.com/ for example has  
around 2 million graphs in it, and we would obviously like to be able  
to describe its contents in a standard manner.

Also, at the risk of sounding like a broken record :) it's critical  
that there be a way to talk about the endpoint using relative URIs or  
some similar trick so that the software emitting the descriptions is  
not required to know the endpoint URI that the client is connected to.

Descriptions using <> as a subject are fine, but other things will be  
problematic.
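
For instance, something along these lines should resolve correctly whichever URI the client used to reach the endpoint. This is only a sketch: the sd: namespace URI is a placeholder, the property names are borrowed from Greg's proposal quoted above, and the graph name is made up.

```turtle
# Placeholder namespace; no sd: vocabulary URI has been agreed.
@prefix sd: <http://example.org/sd#> .

# <> is resolved by the client against the URI it fetched the
# description from, so the server never has to write that URI down.
<> sd:datasetDescription [
	sd:namedGraph [
		sd:graphName <http://example.org/graphs/g1>
	]
] .
```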

> It may be more useful to directly query each graph to get that
> void-like information, if needed.
>
> What about having a simple description listing the graphs +
> default one + the voiD description of the complete endpoint?
>
>>
>> The lack of naming symmetry between sd:defaultGraph (for default  
>> graphs) and sd:graphDescription (for named graphs) could probably  
>> be made better (maybe sd:defaultGraphDescription?), but this  
>> modeling allows each graph in the dataset to be described as well  
>> as things to be said about the entire dataset.
>
> Strictly speaking, isn't the default graph also a named graph (since
> it generally also has its own URI)?

I don't believe so, from §8 of SPARQL 1.0:

"A SPARQL query is executed against an RDF Dataset which represents a  
collection of graphs. An RDF Dataset comprises one graph, the default  
graph, which does not have a name, and zero or more named graphs,  
where each named graph is identified by an IRI."

I do think the area of default / named graphs in the SPARQL spec could  
do with some clarification, as I've not yet met two SPARQL  
implementers who agree on the meaning or intention.

> So
>
> <endpoint> sd:datasetDescription [
> 	sd:defaultGraph [
>                sd:graphName <graph-name> ;
> 		sd:graphDescription <void-dataset-for-default-graph> ;
>        ] ;
> 	sd:namedGraph [
> 		sd:graphName <graph-name> ;
> 		sd:graphDescription <void-dataset-for-named-graph> ;
> 	]
> ] .

Let's not bake voiD into SPARQL. It's sufficient to say that it should
be an RDF description; the exact vocabulary can be left open.
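
As a rough illustration of leaving the vocabulary open, the graph description below uses Dublin Core terms rather than voiD. The sd: namespace URI, the example.org URIs and the literal values are all invented for the example; only the sd: property names come from the proposals quoted in this thread.

```turtle
# Placeholder namespace; no sd: vocabulary URI has been agreed.
@prefix sd: <http://example.org/sd#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

<> sd:datasetDescription [
	sd:namedGraph [
		sd:graphName <http://example.org/graphs/alice> ;
		# Points at an ordinary RDF resource; no particular vocabulary required.
		sd:graphDescription <http://example.org/descriptions/alice>
	]
] .

# The description itself, here in Dublin Core terms instead of voiD.
<http://example.org/descriptions/alice>
	dcterms:title "Example FOAF profile graph" ;
	dcterms:modified "2009-09-26"^^<http://www.w3.org/2001/XMLSchema#date> .
```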

> while I'd think a simple way would be
>
> <endpoint> sd:datasetDescription <void-dataset-for-dataset> ;
> 	sd:defaultGraph <graph-name> ;
> 	sd:namedGraph <graph-name> .

That would be more palatable, but if it's possible for the client to  
request a description there needs to be some way for the client to  
request the description of a specific graph.

>> Additionally, I was hoping to get feedback on whether people think  
>> the vocab should distinguish between language extensions and  
>> supported features? Is this a meaningful distinction? For example,  
>> an extension that modifies the SPARQL syntax (e.g. to support the  
>> BINDINGS keyword) versus a feature describing the algorithm used to  
>> generate DESCRIBE results. The former clearly extends SPARQL, but  
>> the latter works within the existing constraints of SPARQL. Thoughts?

I don't think it's necessary to distinguish explicitly.

I would imagine that the client has some concept of the features it
can make use of. For features it has no use for, whether they're inside
or outside the SPARQL spec is probably not relevant.

An exception would be something that tried to gather generic  
information about SPARQL endpoints (a catalogue of endpoints or so),  
but that's a bit of a corner case, and I doubt it makes much difference  
anyway.

- Steve

-- 
Steve Harris
Garlik Limited, 2 Sheen Road, Richmond, TW9 1AE, UK
+44(0)20 8973 2465  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10  
9AD
Received on Saturday, 26 September 2009 08:37:03 UTC