Re: service description vocabulary from Gregory Williams on 2009-09-28 (public-rdf-dawg@w3.org from July to September 2009)

From: Gregory Williams <greg@evilfunhouse.com>
Date: Mon, 28 Sep 2009 12:16:05 -0400
To: Steve Harris <steve.harris@garlik.com>
Cc: "public-rdf-dawg@w3.org Group" <public-rdf-dawg@w3.org>
Message-Id: <FEC7CF85-476B-48E2-907E-E158B838F4C3@evilfunhouse.com>

On Sep 26, 2009, at 5:06 AM, Steve Harris wrote:

>> On the other side of things, if I look at the dataset description  
>> and find that there's information about a single foaf:Person in the  
>> dataset and I want to retrieve that information, how am I meant to  
>> get to it if I don't know which of the million named graphs it's in?
>
> SELECT ?g ?x WHERE { GRAPH ?g { ?x a foaf:Person } } ?
>
> Maybe I misunderstood the question.

No, you got it right. Maybe I just gave a bad example :) I just don't  
want to end up with the only option of having to describe the whole  
dataset as simply the graph merge of the constituent graphs. Also, I  
know it's hard to argue spec points based on optimization stuff, but  
if you've got a large dataset and a not-so-intelligent query
engine/optimizer, this might be a really bad query to run if simply knowing
what ?g is ahead of time could let you sidestep the whole issue.
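
To illustrate the difference, here is a minimal sketch of the two query
shapes. The helper names and the example graph URI are made up for
illustration; only the first query comes from the thread. The second one
assumes a dataset description has already told the client which named
graph holds the foaf:Person.

```python
# Hypothetical helpers that build the two query shapes discussed above.
FOAF = "http://xmlns.com/foaf/0.1/"


def person_query_all_graphs() -> str:
    """Query with a graph variable: the engine may have to probe every named graph."""
    return (
        f"PREFIX foaf: <{FOAF}>\n"
        "SELECT ?g ?x WHERE { GRAPH ?g { ?x a foaf:Person } }"
    )


def person_query_known_graph(graph_uri: str) -> str:
    """Query pinned to one named graph that the client already knows about."""
    return (
        f"PREFIX foaf: <{FOAF}>\n"
        f"SELECT ?x WHERE {{ GRAPH <{graph_uri}> {{ ?x a foaf:Person }} }}"
    )


if __name__ == "__main__":
    print(person_query_all_graphs())
    # The graph URI below is an assumed example, not one from the thread.
    print(person_query_known_graph("http://example.org/graphs/alice"))
```

Whether the second form is actually cheaper depends on the store, but it
at least gives a simple engine nothing to search over.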

> Let's not fixate on Void. If Void is not sufficient then the  
> community will come up with something more comprehensive.

Well, I'm torn between saying "yes, absolutely," and thinking that  
there are people (like the voiD folks) that are working on describing  
RDF graphs, but that the SPARQL dataset case is specific enough to  
SPARQL that maybe we should be providing the handful of properties to  
allow leveraging graph description vocabularies in the context of  
SPARQL datasets.

>>> while I'd think a simple way would be
>>>
>>> <endpoint> sd:datasetDescription <void-dataset-for-dataset> ;
>>> 	sd:defaultGraph <graph-name> ;
>>> 	sd:namedGraph <graph-name> .
>>
>> Again, I'm not sure how useful the <void-dataset-for-dataset> would  
>> be, since by this point you've lost the ability to discriminate  
>> between the named (and default) graphs.
>
> Well, it's a little murky anyway, both the protocol and query have  
> the ability to change the contents of the default graph.

In many cases you'd know that, though, right? If I provide a FROM  
clause or a default-graph-uri parameter, then I wouldn't expect the  
default graph description data to match up (since I've explicitly
overridden the default graph). In this case, either I can get the same graph
description from one of the provided named graph descriptions (if the  
endpoint has a list of available graphs that I can use in FROM  
clauses) or I can get it from an external source if, for example, the  
endpoint is simply dereferencing an RDF URL I give it.
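
As a sketch of the protocol-level override mentioned above, the snippet
below builds a GET request URL that carries default-graph-uri and
named-graph-uri parameters alongside the query. The function name, the
endpoint URL, and the graph URIs are assumptions for illustration; the
parameter names are the ones defined by the SPARQL Protocol.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_request_url(endpoint, query, default_graphs=(), named_graphs=()):
    """Build a SPARQL Protocol GET URL that overrides the endpoint's dataset.

    Each URI in default_graphs becomes a default-graph-uri parameter and
    each URI in named_graphs becomes a named-graph-uri parameter.
    """
    params = [("query", query)]
    params += [("default-graph-uri", g) for g in default_graphs]
    params += [("named-graph-uri", g) for g in named_graphs]
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    # Endpoint and graph URI are hypothetical.
    url = build_request_url(
        "http://example.org/sparql",
        "SELECT ?x WHERE { ?x a <http://xmlns.com/foaf/0.1/Person> }",
        default_graphs=["http://example.org/foaf.rdf"],
    )
    print(url)
```

A client that sends such a request has chosen its own default graph, so
any description the endpoint publishes for its built-in default graph
would not be expected to apply to the results.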

> Requiring systems to return everything in one graph could be onerous  
> for client and server, eg. in the 2M FOAF graph case, both the list  
> of graphs, and the description of the store will be large.

Agreed. I would, however, like this as an *option* for implementations.

.greg

Received on Monday, 28 September 2009 16:16:41 UTC