Re: Comments on SPARQL 1.1 Service Description WD 20091022 from Gregory Williams on 2010-01-12 (public-rdf-dawg-comments@w3.org from January 2010)

From: Gregory Williams <greg@evilfunhouse.com>
Date: Mon, 11 Jan 2010 20:47:12 -0500
To: Leigh Dodds <leigh.dodds@talis.com>
Cc: public-rdf-dawg-comments@w3.org
Message-Id: <AFE2BE21-A560-4B7E-B345-4198696359B7@evilfunhouse.com>

Leigh,

Thanks for the comments (and apologies for the very late response).

On Oct 24, 2009, at 9:27 AM, Leigh Dodds wrote:
> Hi,
> 
> Here are some personal comments on the SPARQL 1.1 Service Description
> Working Draft published on 22/10/2009.
> 
> * Section 3.2.2 sd:Language. It would be useful to clarify the text to
> indicate that the "subset of the SPARQL language" that is being
> described is either SPARQLQuery or SPARQLUpdate. My initial reading
> was that the class was to be used to describe subsets of the SPARQL
> *query* language, but I don't think this is what is intended.

I agree the wording should be improved. Query and Update are the obvious sd:Languages here, but we have talked about the possibility of having other defined subsets (e.g. a "safe" query language subset that couldn't initiate network connections via FROM clauses or basic federated query constructs). While the set of sd:Languages we define is still up in the air, I'll look into clarifying the spec text.


> * Section 3.2.3 sd:Function. This section lists scalar functions,
> aggregate functions and entailment regime. It ought to be updated to
> include an entry for property functions; these features are
> implemented in a number of processors already and are an important
> capability to be able to document. I've written up some notes on
> different forms of SPARQL extension here [1]

While the service description is intended to represent exactly this sort of thing, I'm not sure the service description vocabulary should make direct reference to features that aren't sanctioned by the SPARQL spec. The sd:feature property can be used to point to any feature IRI, including extensions to SPARQL itself, but the only features I'm inclined to enumerate in the spec are those that are defined by the spec.
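
As a rough sketch of that use of sd:feature (the feature IRI here is a made-up placeholder under http://example/, not something defined by any spec, and sd:url is taken from the current draft), an endpoint that supports property functions might describe itself like this:

[] a sd:Service ;
  sd:url <http://example/sparql> ;
  sd:feature <http://example/feature/property-functions> .

A client that recognizes the IRI can act on it, and one that doesn't can ignore the triple.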


> * Section 3.4.7/3.4.8. Using the provided functionality I can refer to
> a dataset that the endpoint exposes, but in the circumstances where an
> endpoint exposes multiple datasets and/or graphs I cannot yet state:
> this one is the default. Ideally there should be a way to indicate
> which of the described datasets (if any) will be used as the default
> if the query or protocol request doesn't include a dataset

The next draft of the spec will include new properties to address this issue. An example dataset description might look like:

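@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sd:    <http://www.w3.org/ns/sparql-service-description#> .
@prefix void:  <http://rdfs.org/ns/void#> .
@prefix scovo: <http://purl.org/NET/scovo#> .

[] a sd:Service ;
  sd:url <http://example/sparql> ;
  sd:defaultDatasetDescription [
    sd:defaultGraph [
      void:statItem [ scovo:dimension void:numberOfTriples ; rdf:value 100 ]
    ] ;
    sd:namedGraph [
      sd:named <http://example/graph> ;
      sd:graph [
        void:statItem [
          scovo:dimension void:numberOfTriples ;
          rdf:value 200
        ]
      ]
    ]
  ] .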


> * One important aspect of SPARQL endpoints is whether they use a fixed
> dataset or allow an arbitrary dataset to be constructed for the
> purposes of the query, e.g. by fetching data from URIs specified in the
> FROM/FROM NAMED clauses. It would be useful to be able to describe
> this capability in the service description. Simply omitting a dataset
> description in this case is ambiguous (the dataset may still be
> "fixed" but it's just not described). One option would be to include an
> additional sub-type of endpoint to allow them to be classified in this
> way.

The next draft will include a feature IRI specifically for indicating the ability to fetch graphs named in the FROM/FROM NAMED clauses.
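
As a sketch only (the IRI below is a placeholder for illustration, not the name the draft will use), a service that fetches graphs named in FROM/FROM NAMED might be described as:

[] a sd:Service ;
  sd:url <http://example/sparql> ;
  sd:feature <http://example/feature/fetches-from-graphs> .

The absence of such a feature wouldn't by itself prove the dataset is fixed, but its presence would remove the ambiguity in the positive case.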


> * The SPARQL protocol document notes that a SPARQL endpoint may refuse
> to process a request if a dataset is not specified. Perhaps this
> potential behaviour should be documented in the service description,
> e.g. indicating that one of the described datasets must be used in the
> protocol or query

I don't believe the working group has considered this yet, but I'd be happy to discuss it for inclusion in a future draft.


> *  Section 4.1, "Graph Store and SPARQL Query Services" of the SPARQL
> 1.1 Update document notes that the graphs for the update service may
> differ from those offered by the query service. In the present Service
> Description specification I don't see a way to indicate that one or
> more datasets are to be used for the update service while others may
> only be available for query. A simple and common example of this would
> be a dataset which is the union of all stored named graphs in the
> Graph Store. This is a feature of several SPARQL implementations but
> this dataset is unlikely to be available for manipulation through
> SPARQL Update, etc.
> 

Good point. There hasn't been much alignment between the service description and update documents yet, but it's certainly something we'll be looking at.

As for the specific case of a default graph (I assume you meant graph and not dataset?) being the union of all named graphs, this is another feature for which we'll be providing a feature IRI. Knowing a service supports this feature probably won't directly indicate whether the default graph is updateable, but it will be an indication that the graph content depends on other graphs.


> Cheers,
> 
> L.
> 
> [1]. http://www.ldodds.com/blog/2009/10/surveying-and-classifying-sparql-extensions/


thanks,
greg
(on behalf of the SPARQL Working Group)
Received on Tuesday, 12 January 2010 01:47:44 UTC