Comments on SPARQL 1.1 from Rob Vesse on 2009-11-13 (public-rdf-dawg-comments@w3.org from November 2009)

From: Rob Vesse <rvesse@dotnetrdf.org>
Date: Fri, 13 Nov 2009 10:46:45 -0000
To: <public-rdf-dawg-comments@w3.org>
Message-ID: <000d01ca644e$993b7dc0$cbb27940$@org>

Hi all,

Here are my comments and questions on the draft.  Some are personal opinion
and some come from starting to implement these features in the SPARQL
engine in the latest development builds of dotNetRDF (a .Net RDF library,
currently at an early Alpha release).

Aggregates
- I definitely support the use of a separate HAVING keyword.  I agree with
Leigh that it makes clear that the type of constraint is different, and it
helps SQL programmers make the move to SPARQL by keeping the syntax familiar.

- The proposed grammar has a couple of things which bug me:

  1. It allows a Having clause without a Group By clause, which doesn't
necessarily make sense depending on the constraint given.  I assume that if
you have a Having clause without a Group By then it should act as a Filter
on the result set?
  The only case I can see for allowing this is when the Having clause uses
an aggregate, which makes sense in some circumstances, e.g. checking whether
more than a given number of Triples match a pattern:

SELECT * WHERE {?s a ?type} HAVING (COUNT(?s) > 10)

  2. What is the intended meaning of grouping by an aggregate, given that
the grammar permits this?  Should this be permitted at all?
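
To make point 1 concrete, here is a minimal Python sketch of one possible
reading: with no Group By, the whole result set is treated as a single
implicit group, and the Having constraint either keeps every solution or
drops them all.  This is an assumption about the intended semantics, not
something the draft states, and the names used (having_without_group_by,
count) are hypothetical rather than taken from the draft or from dotNetRDF.

```python
from typing import Callable, Dict, List

# A solution maps variable names (without the leading '?') to term strings.
Solution = Dict[str, str]


def count(solutions: List[Solution], var: str) -> int:
    """COUNT(?var): number of solutions in which the variable is bound."""
    return sum(1 for s in solutions if var in s)


def having_without_group_by(
    solutions: List[Solution],
    constraint: Callable[[List[Solution]], bool],
) -> List[Solution]:
    """Assumed semantics: the whole result set forms one implicit group.

    The constraint is evaluated once over that group; if it holds, all
    solutions are kept, otherwise the result is empty.
    """
    return list(solutions) if constraint(solutions) else []


# Twelve solutions for {?s a ?type}; example.org IRIs are placeholders.
rows = [
    {"s": f"http://example.org/s{i}", "type": "http://example.org/T"}
    for i in range(12)
]
kept = having_without_group_by(rows, lambda g: count(g, "s") > 10)
dropped = having_without_group_by(rows[:5], lambda g: count(g, "s") > 10)
```

Under this reading the Having clause acts as an all-or-nothing filter on
the result set, which matches the "are there more than N matches" use case.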

- On the subject of which aggregates to include, I think the ones currently
proposed are all useful and I can't think of any obvious missing ones.  With
regards to how MIN and MAX should operate, would it be reasonable to suggest
that, since SPARQL defines a partial ordering over values, MIN/MAX should
return the minimum/maximum based on that ordering?  This is much easier to
implement than doing some form of type detection or having some complex
algorithm for how they operate over mixed datatypes.  While it does have the
disadvantage of potentially returning different results depending on how
exactly the SPARQL engine orders values, I would be happy with this
behaviour.  If people really need type-specific minima/maxima then they can
use appropriate FILTERs in their queries, or possibly extra aggregates could
be introduced, e.g. NMIN/NMAX (Numeric minimum/maximum).
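
As a rough illustration of the ordering-based MIN/MAX suggested above, here
is a minimal Python sketch.  It ranks term kinds the way SPARQL's ORDER BY
does (blank nodes, then IRIs, then literals).  Everything else is an
assumption made for the sketch: the Term type, the function names, and in
particular the ordering among literals (numeric literals by value, then
other literals by lexical form), which is exactly the part that may differ
between engines.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Union

# Rank of each term kind, following the ORDER BY ordering:
# blank nodes < IRIs < literals.
KIND_RANK = {"bnode": 0, "iri": 1, "literal": 2}


@dataclass(frozen=True)
class Term:
    """Hypothetical term representation used only for this sketch."""
    kind: str                      # one of the KIND_RANK keys
    value: Union[str, int, float]  # lexical form, or a number for numerics


def order_key(term: Term):
    rank = KIND_RANK[term.kind]
    if term.kind == "literal" and isinstance(term.value, (int, float)):
        # Assumption: numeric literals sort by value, before other literals.
        return (rank, 0, term.value, "")
    # Assumption: everything else sorts by its string form within its kind.
    return (rank, 1, 0, str(term.value))


def sparql_min(terms: Iterable[Term]) -> Optional[Term]:
    """MIN as 'first term under the engine's ordering'; None if empty."""
    return min(terms, key=order_key, default=None)


def sparql_max(terms: Iterable[Term]) -> Optional[Term]:
    """MAX as 'last term under the engine's ordering'; None if empty."""
    return max(terms, key=order_key, default=None)


values = [
    Term("literal", 5),
    Term("iri", "http://example.org/a"),
    Term("literal", "apple"),
    Term("literal", 2.5),
]
lowest = sparql_min(values)    # the IRI, since IRIs rank below literals
highest = sparql_max(values)   # the plain literal "apple" in this ordering
```

No type detection is needed: the aggregate reuses whatever key the engine
already uses for ORDER BY, at the cost of results that depend on that key.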

Subqueries
- I would appreciate some clearer guidance on variable scoping as I don't
feel comfortable attempting to implement these until I have a better idea of
how this should work.

Projection Expressions
- I like these very much and have already implemented them.  I personally
don't like the idea of a LET keyword, as it restricts the ability of the
SPARQL processor to decide in what order it executes the query.  It also
starts, to my mind, to turn SPARQL into the equivalent of Transact SQL (and
other vendor-specific SQL stored procedure languages), which feels wrong to
me.

Rob Vesse
dotNetRDF Lead Developer
Received on Friday, 13 November 2009 12:10:19 UTC