Re: Comments on latest SPARQL 1.1 Working Drafts from Gregory Williams on 2010-03-22 (public-rdf-dawg-comments@w3.org from March 2010)

From: Gregory Williams <greg@evilfunhouse.com>
Date: Mon, 22 Mar 2010 18:33:13 -0400
To: Rob Vesse <rvesse@dotnetrdf.org>
Cc: <public-rdf-dawg-comments@w3.org>
Message-Id: <B88789C2-906F-48B9-9B06-C6DB871715BD@evilfunhouse.com>
On Feb 9, 2010, at 9:30 AM, Rob Vesse wrote:

> Hi All
> 
> Here's my comments on the latest drafts - specifically the Query and the
> Service Description drafts.

Rob,

Thanks for the comments.


==Aggregates==

There is currently no plan to reduce the set of aggregates, so SAMPLE is highly likely to be included in subsequent drafts. It is necessary because SPARQL, unlike SQL, has no implicit sampling behaviour.
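
As a hedged illustration of where SAMPLE is needed: a variable that is not a grouping key cannot be projected directly from a grouped query, so SAMPLE picks one arbitrary value per group. The FOAF data is assumed for the example, and the syntax is that of the SPARQL 1.1 Recommendation as eventually published, not necessarily that of the drafts under discussion.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# One row per person. ?name is not a grouping key, so it must be aggregated;
# SAMPLE returns an arbitrary one of that person's names.
SELECT ?person (SAMPLE(?name) AS ?anyName)
WHERE {
  ?person foaf:name ?name
}
GROUP BY ?person
```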

GROUP_CONCAT is named that way so that there can also be a scalar CONCAT function. GROUP_CONCAT should probably not be a conventional two-argument function, as that causes some confusion with the semantics of aggregates. There is a possibility of a syntax like GROUP_CONCAT(?x SEPARATOR "\t"); the intention is to add this before the next WD.
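
A minimal sketch of the idea, with the caveat that it uses the separator form that ended up in the SPARQL 1.1 Recommendation (a semicolon and `SEPARATOR = "..."`), which differs slightly from the form floated above. The FOAF data is assumed for the example.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Aggregate: joins all nicknames in each group into one tab-separated string.
# The separator is a modifier of the aggregate, not a second argument.
SELECT ?person (GROUP_CONCAT(?nick ; SEPARATOR = "\t") AS ?nicks)
WHERE {
  ?person foaf:nick ?nick
}
GROUP BY ?person
```

The scalar counterpart, CONCAT(?a, ?b), operates on the values within a single solution rather than across a group, which is why the two need distinct names.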

==Property Paths==

> I agree with some of the previous comments on the list that some of the features in property paths seem overly complex, e.g. alternatives.  If you really need to do alternatives isn't it best just to use UNIONs?

Property path features can be combined into complex paths. Allowing alternatives in property paths makes for more compact expressions.
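
To illustrate the compactness argument, here is a sketch of an alternative nested inside a sequence path, followed by a UNION form that should return the same solutions. The vocabulary choice is an assumption made for the example, and the syntax is that of the SPARQL 1.1 Recommendation as eventually published.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# The label or title of anything a person knows, as a single path.
SELECT ?person ?text
WHERE {
  ?person foaf:knows/(rdfs:label|dc:title) ?text
}
```

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# The same pattern without paths: an intermediate variable and a UNION.
SELECT ?person ?text
WHERE {
  ?person foaf:knows ?other .
  { ?other rdfs:label ?text } UNION { ?other dc:title ?text }
}
```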

> Returning results of a path expression in an ordered way (with regards to RDF lists) seems at odds with the general evaluation model of SPARQL which as I understood it was that the results were an unordered multiset up until you start applying solution modifiers and only actually becomes ordered if an OrderBy is applied

Order of results from property paths is not guaranteed.

> Providing lengths of paths would complicate things and I don't think it should be in the 1.1 spec

Providing lengths is not currently planned. This does weaken the usefulness of property paths but, as a time-permitting feature, the WG is inclined to leave analysis and specification of including lengths to a later group when more deployed experience is available. The WG believes it has not designed out the possibility: for example, potential syntax forms have been considered to make sure the syntax is not a barrier to a future WG.

> Limiting results of path expressions to being distinct seems logical and would aid implementation since you can potentially build a list of valid paths as you evaluate the expression and by checking that you haven't already found a specific path you can do cycle detection very easily (I may be wrong here I'm just thinking off the top of my head how I might implement paths)

Thank you for the observation.

==Open Issues==

> 5: Surely there is nothing you can express in an ASK that you can't with an EXISTS?

Yes - EXISTS behaves like a nested ASK.
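
As a rough sketch of the correspondence (FOAF data assumed, SPARQL 1.1 Recommendation syntax): the first query asks whether any match exists at all, while the second applies the same kind of test once per solution, using the current binding of ?person.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Top-level test: does anyone have a mailbox?
ASK { ?person foaf:mbox ?mbox }
```

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Per-solution test: keep only the named people who have a mailbox.
SELECT ?person ?name
WHERE {
  ?person foaf:name ?name .
  FILTER EXISTS { ?person foaf:mbox ?mbox }
}
```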

> 14: I think the aggregates defined now are sufficient and people can provide extensions as per my comments on issue 15

It is proposed that extension aggregates would be described by a URI as functions are.

> 15: Extension aggregates should be defined by URIs just as with extension functions and the individual implementations can then generate appropriate structures depending on whether the URI indicates an aggregate/expression. For example I've already defined a few in the function library for my engine [2] e.g.
>  
> PREFIX lfn:<http://www.dotnetrdf.org/leviathan#>
> 
> SELECT ?s lfn:all(IsUri(?o)) AS ?AllObjectsAreUris
> WHERE
> {
>    ?s ?p ?o
> } GROUP BY ?s

This is what the group intends to do. Extension aggregates will also be able to take the DISTINCT flag.

> 35: I think that with the aggregates currently proposed the only ones which DISTINCT makes sense for are COUNT and possibly GROUP_CONCAT though I'd rather have it as only valid for COUNT

The group has decided to allow DISTINCT as a flag to all aggregates, as per SQL.
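
For illustration only, a sketch of DISTINCT applied to more than one aggregate. The ex: namespace and its properties are hypothetical, and the syntax is that of the SPARQL 1.1 Recommendation as eventually published.

```sparql
PREFIX ex: <http://example.org/ns#>

# DISTINCT removes duplicate values within each group before aggregating.
SELECT ?order
       (COUNT(DISTINCT ?item) AS ?distinctItems)
       (SUM(DISTINCT ?price)  AS ?sumOfDistinctPrices)
WHERE {
  ?order ex:item  ?item .
  ?item  ex:price ?price
}
GROUP BY ?order
```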

> 36: I think this should be rejected at the parsing stage - you shouldn't be able to project an expression to an existing variable

It can't be enforced by the grammar, but the current text supports this.

> 39: I don't see too much of an issue with this though this may require some queries to be rewritten such that projection expressions are evaluated in such an order that the necessary expressions are evaluated prior to their value being used

The WG has discussed this and does plan to allow a variable to be used later in a SELECT expression list, with clear rules on scoping.
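
A minimal sketch of what this permits, assuming a hypothetical ex: vocabulary and the syntax of the SPARQL 1.1 Recommendation as eventually published: ?net is introduced by one select expression and used by the next one to its right.

```sparql
PREFIX ex: <http://example.org/ns#>

# ?net is bound by the first expression and is in scope for the second.
SELECT ?line (?price * ?qty AS ?net) (?net * 1.2 AS ?gross)
WHERE {
  ?line ex:price    ?price ;
        ex:quantity ?qty
}
```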

> 41: GROUP BY expression should be permitted

The current text supports this.
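
As a hedged example of grouping by an expression (SPARQL 1.1 Recommendation syntax; the data is assumed to carry xsd:dateTime values for dc:date):

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>

# Group documents by the year of their date and count them per year.
SELECT ?year (COUNT(?doc) AS ?docs)
WHERE {
  ?doc dc:date ?date
}
GROUP BY (YEAR(?date) AS ?year)
```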

==Service Description==

> Looking at the Service Description draft my main concern is that it allows you to specify that you support some extension functions but not to say anything about the arguments of those functions.  For example there's no way to express that an extension function takes 2 arguments both of which must be xsd:string and gives back an xsd:string This may be too complicated for the service description to express easily and I guess you run into issues when you have functions like fn:concat() which can take variable/unlimited numbers of arguments.  Is the assumption that a user/their agent will be able to retrieve the description of that function from somewhere else?

The intention of the Service Description vocabulary is to provide a minimal set of terms that can allow a simple description of a SPARQL endpoint, its dataset(s), and supported features. Importantly, we're not trying to provide a vocabulary with which to describe *all* possible aspects of an endpoint, including specifics of the supported functions (such as argument and return types) or dataset descriptions.

Our expectation is that with the infrastructure of service descriptions shared between endpoints, implementers can start to use/develop vocabularies for describing services in more detail, with consensus hopefully developing around specific features. For example, voiD[1] is likely to be a good way to describe datasets, while SPIN[2] might provide the sort of extension function descriptions you are talking about.

==Grammar==

> As a general point on the query draft the EBNF in the grammar section is still the 1.0 EBNF and does not contain the new rules which 1.1 introduces - though I guess this may be in part due to the rules not being finalised? Some of the new EBNF is embedded in the course of the text but some of it seems to have disappeared at the moment.

The WG intends to produce a single grammar for both query and update languages because they share many grammar rules.  The EBNF in the new features is included to be helpful as indicative of changes that will be made to the final SPARQL 1.1 grammar.


We hope this message addresses your comments. If it does, could you please help our comment tracking by replying to this message to state that you are satisfied with this response?

thanks,
Gregory Williams, Steve Harris, Andy Seaborne
on behalf of the SPARQL working group.

[1] http://rdfs.org/ns/void-guide
[2] http://spinrdf.org/sp.html
Received on Monday, 22 March 2010 22:33:45 UTC