SD vocab updates: dataset descriptions

I've been updating the service description vocabulary based on the discussion at the F2F, and wanted to run the changes by everyone (since many of you weren't at the F2F).

The primary change is related to the link between a SPARQL service and a dataset description. This is where we're going to be punting a bit to other vocabularies such as voiD, and letting them do the actual dataset descriptions. However, as I discussed at the F2F[1], I thought we needed a way to link a dataset to its default graph, since the default graph is very SPARQL-specific and not something likely to show up in a more general dataset description vocabulary. Given this, I've changed the SD vocab in the following ways:

* Added a sd:Dataset class. I'm hoping we can work with the voiD group to make sure this aligns with their notion of a dataset so voiD properties could be attached to a sd:Dataset.

* Replaced sd:datasetDescription with two new terms: sd:defaultDataset and sd:availableDataset. sd:defaultDataset links a sd:Service with a description of the default dataset used for query answering if none is provided by the query or protocol. It may use the defaultGraph property described below. sd:availableDataset links a sd:Service with a description of a dataset containing named graphs that may be used in FROM/FROM NAMED clauses.

* Added URL variants of the above two terms: sd:defaultDatasetURL and sd:availableDatasetURL. These are meant to allow linking not to the dataset description directly but to a dereferenceable document that contains such descriptions. This allows the service description to be kept small while providing access to very large dataset descriptions.

* Added a sd:defaultGraph term for linking a sd:Dataset with a description of the default graph in that dataset. For now I'm leaving the rdfs:range of this term open, so that vocabularies like voiD can define the graph description themselves.
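
To make the shape of this concrete, here is a rough sketch of how the terms above might chain together: service to default dataset, then dataset to default graph. It models triples as plain Python tuples rather than using an RDF library. The example.org IRIs are made up for illustration, and the IRI used for the sd: namespace is an assumption, not something fixed by this message.

```python
# Assumed namespace IRI for the sd: prefix (an assumption for this sketch).
SD = "http://www.w3.org/ns/sparql-service-description#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical IRIs, used only for illustration.
SERVICE = "http://example.org/sparql"
DATASET = "http://example.org/dataset/default"
GRAPH = "http://example.org/graph/default"

# A tiny service description as a set of (subject, predicate, object) triples.
triples = {
    (SERVICE, RDF_TYPE, SD + "Service"),
    (SERVICE, SD + "defaultDataset", DATASET),
    (DATASET, RDF_TYPE, SD + "Dataset"),
    (DATASET, SD + "defaultGraph", GRAPH),
}


def objects(triples, subject, predicate):
    """Return the sorted objects of all triples with this subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def default_graphs(triples, service):
    """Follow sd:defaultDataset, then sd:defaultGraph, starting from a service."""
    result = []
    for dataset in objects(triples, service, SD + "defaultDataset"):
        result.extend(objects(triples, dataset, SD + "defaultGraph"))
    return result


print(default_graphs(triples, SERVICE))
```

Whatever sits at the end of the sd:defaultGraph arc is where a vocabulary like voiD would attach its own properties, since the range is left open.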


I'd like to get some feedback on these changes from the group. In particular, I'm curious about people's feelings on two issues:

(1) Should the use of sd:availableDataset imply that the endpoint will only allow use of the named graphs in FROM/FROM NAMED clauses, or could it be used simply to link to locally cached/generated descriptions of commonly used datasets? If the latter, we (or somebody else) could coin a sd:feature IRI to indicate that an endpoint has the ability to dereference graph URLs.

(2) How do people feel about the URL variants of the dataset properties? I know several people had indicated that they wanted a way to link to dataset descriptions without including them in the service description, and these terms were created to satisfy that need. However, the logistics of actually using the terms feel a bit strange to me (do you just search for any dataset instance in the retrieved RDF, similar to how foaf:PersonalProfileDocument is used?) and there are other ways this could be handled (we might assume that if a dataset description isn't in the SD RDF then the dataset IRI is dereferenceable and will return the description).
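
For comparison, the two approaches in (2) could be sketched roughly as follows. This is only an illustration under assumptions: triples are plain tuples, `fetch` is a hypothetical callable standing in for dereferencing an IRI and parsing the result, the example.org IRIs are invented, and the sd: namespace IRI is assumed.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Assumed IRI for sd:Dataset (the namespace is an assumption for this sketch).
SD_DATASET = "http://www.w3.org/ns/sparql-service-description#Dataset"


def find_datasets(triples):
    """URL-variant approach: after retrieving the linked document,
    pick out every subject typed as sd:Dataset."""
    return sorted(s for s, p, o in triples if p == RDF_TYPE and o == SD_DATASET)


def describe_dataset(sd_triples, dataset_iri, fetch):
    """Alternative approach: use the description found in the SD RDF if there
    is one; otherwise dereference the dataset IRI itself.

    `fetch` is a hypothetical callable: IRI in, set of triples out.
    """
    local = {t for t in sd_triples if t[0] == dataset_iri}
    if local:
        return local
    return {t for t in fetch(dataset_iri) if t[0] == dataset_iri}


# Hypothetical data standing in for a remotely hosted dataset description.
DATASET = "http://example.org/dataset/big"
remote_doc = {
    (DATASET, RDF_TYPE, SD_DATASET),
    ("http://example.org/doc", RDF_TYPE, "http://example.org/SomethingElse"),
}


def fake_fetch(iri):
    """Stand-in for an HTTP GET plus RDF parse; always returns remote_doc."""
    return remote_doc


print(find_datasets(remote_doc))
print(describe_dataset(set(), DATASET, fake_fetch))
```

The first function has to guess which dataset instance in the document is the one meant, which is the part that feels strange; the second avoids the guess but requires dataset IRIs to be dereferenceable.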

Thoughts?

thanks,
.greg


[1] Whiteboard drawing of the dataset description modeling from F2F2: <http://thefigtrees.net/lee/dl/sparql-IMG00009-20091103-1508.jpg>. The unlabeled blue arc should be "default graph".

Received on Saturday, 21 November 2009 00:09:25 UTC