Comments on latest SPARQL 1.1 Working Drafts from Rob Vesse on 2010-02-09 (public-rdf-dawg-comments@w3.org from February 2010)

From: Rob Vesse <rvesse@dotnetrdf.org>
Date: Tue, 9 Feb 2010 14:30:01 -0000
To: <public-rdf-dawg-comments@w3.org>
Message-ID: <003e01caa994$5eeb3fd0$1cc1bf70$@org>
Hi All

Here are my comments on the latest drafts, specifically the Query and
Service Description drafts.

Aggregates
- The SAMPLE aggregate is definitely essential; it solves an issue I'd come
across in testing with regard to projecting variables that you are not
grouping by.
- Why is GROUP_CONCAT called GROUP_CONCAT and not simply CONCAT? I think the
former may be unambiguous in its meaning, but is there a particular reason
why the latter is not being used? I'm assuming the WG thinks there may be
some ambiguity. With regard to which separator to use, I'm inclined to say
that a newline or a comma seems more logical for group concatenation. Is
there any particular reason why there couldn't be a two-argument version of
GROUP_CONCAT that allows an arbitrary expression to specify the separator,
since the XPath string-join function [1] (which GROUP_CONCAT is an alias
for) supports this anyway?
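
As a rough sketch of the two points above: the query below groups by
?person only, uses SAMPLE to project one value of a variable that is not
grouped by, and shows what a two-argument GROUP_CONCAT might look like. The
FOAF data shape and the parenthesised (expr AS ?var) projection form are
assumptions made for illustration, and the two-argument GROUP_CONCAT is
hypothetical syntax for the suggestion, not something taken from the draft.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?person
       # SAMPLE picks one ?nick per group, so a variable that is not
       # grouped by can still be projected.
       (SAMPLE(?nick) AS ?anyNick)
       # HYPOTHETICAL two-argument form: the second argument is an
       # expression giving the separator, as with XPath string-join.
       (GROUP_CONCAT(?nick, ", ") AS ?allNicks)
WHERE
{
  ?person foaf:nick ?nick
} GROUP BY ?person
```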

Property Paths
- I agree with some of the previous comments on the list that some of the
features in property paths seem overly complex, e.g. alternatives. If you
really need alternatives, isn't it best just to use UNION?
- Returning the results of a path expression in an ordered way (with regard
to RDF lists) seems at odds with the general evaluation model of SPARQL. As
I understand it, the results are an unordered multiset up until you start
applying solution modifiers, and they only actually become ordered if an
ORDER BY is applied.
- Providing lengths of paths would complicate things and I don't think it
should be in the 1.1 spec.
- Limiting the results of path expressions to being distinct seems logical
and would aid implementation: you can potentially build a list of valid
paths as you evaluate the expression, and by checking that you haven't
already found a specific path you can do cycle detection very easily. (I may
be wrong here; I'm just thinking off the top of my head about how I might
implement paths.)
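
To make the point about alternatives concrete, here is a sketch of the same
query written both ways. The dc:title and rdfs:label properties are chosen
purely for illustration, and the | operator is assumed to be the
alternative syntax from the property paths draft. As far as I can tell the
two forms should give the same solutions.

```sparql
PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Using a path alternative: match either property in a single pattern.
SELECT ?book ?label
WHERE
{
  ?book dc:title|rdfs:label ?label
}
```

```sparql
PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Using plain SPARQL 1.0: one UNION branch per alternative.
SELECT ?book ?label
WHERE
{
  { ?book dc:title ?label }
  UNION
  { ?book rdfs:label ?label }
}
```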

Open Issues
- 5: Surely there is nothing you can express in an ASK that you can't with
an EXISTS?
- 14: I think the aggregates defined now are sufficient, and people can
provide extensions as per my comments on issue 15.
- 15: Extension aggregates should be defined by URIs, just as extension
functions are, and individual implementations can then generate the
appropriate structures depending on whether the URI indicates an aggregate
or an expression. For example, I've already defined a few in the function
library for my engine [2]:

PREFIX lfn: <http://www.dotnetrdf.org/leviathan#>

SELECT ?s lfn:all(IsUri(?o)) AS ?AllObjectsAreUris
WHERE
{
  ?s ?p ?o
} GROUP BY ?s

- 35: I think that with the aggregates currently proposed, the only ones
for which DISTINCT makes sense are COUNT and possibly GROUP_CONCAT, though
I'd rather have it valid only for COUNT.
- 36: I think this should be rejected at the parsing stage; you shouldn't
be able to project an expression to an existing variable.
- 39: I don't see too much of an issue with this, though it may require
some queries to be rewritten so that projection expressions are evaluated
in an order that ensures the necessary expressions are evaluated before
their values are used.
- 41: GROUP BY expressions should be permitted.
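
As a sketch of what issues 35 and 41 might look like in practice, the query
below counts distinct titles per language tag by grouping on an expression.
The dc:title data and the (expr AS ?var) naming inside GROUP BY are
assumptions made for illustration rather than syntax confirmed by the draft.

```sparql
PREFIX dc: <http://purl.org/dc/elements/1.1/>

# Issue 35: DISTINCT inside COUNT so repeated titles are counted once.
SELECT ?lang (COUNT(DISTINCT ?title) AS ?titles)
WHERE
{
  ?book dc:title ?title
}
# Issue 41: group by an expression rather than a plain variable.
# ASSUMPTION: the grouping expression can be named with AS.
GROUP BY (LANG(?title) AS ?lang)
```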

As a general point on the Query draft, the EBNF in the grammar section is
still the 1.0 EBNF and does not contain the new rules which 1.1 introduces,
though I guess this may be in part due to the rules not being finalised.
Some of the new EBNF is embedded in the body of the text, but some of it
seems to be missing at the moment.

Looking at the Service Description draft, my main concern is that it allows
you to state that you support some extension functions but not to say
anything about the arguments of those functions. For example, there's no way
to express that an extension function takes two arguments, both of which
must be xsd:string, and returns an xsd:string.

This may be too complicated for the service description to express easily,
and I guess you run into issues when you have functions like fn:concat()
which can take a variable or unlimited number of arguments. Is the
assumption that a user or their agent will be able to retrieve the
description of that function from somewhere else?

Rob Vesse
dotNetRDF Lead Developer

[1] http://www.w3.org/TR/xpath-functions/#func-string-join
[2] http://www.dotnetrdf.org/demos/leviathan/
Received on Tuesday, 9 February 2010 15:01:42 UTC