First attempt at a grammar for SPARQL/Query 1.1

My ideal (and I have not had a chance to talk to Steve yet) is that we have a grammar in the FPWD, but given how much there is to go into the grammar and the time available, this may be a stretch objective too far.

Here is a first attempt at a grammar for SPARQL/Query 1.1.

   http://jena.hpl.hp.com/~afs/sparql-1.1.html


We can choose to include some version of the grammar, or not, nearer the time depending on the overall level of comfort with it.

It includes:

+ GROUP BY/HAVING [17], [18], [19], [20]
+ SELECT expressions [8]
+ SubSelect [7], [43]
+ aggregate functions are COUNT, SUM, MIN, MAX, AVG, STDDEV (although what some of them mean in all cases is a separate issue). [104]
+ COALESCE and IF [100]
+ EXISTS, NOT EXISTS, as graph patterns [52], [53] and in explicit FILTERS [102] [103].
    UNSAID is a synonym for NOT EXISTS
+ SPARQL/Update/Lang
+ Property paths [45] and [67] then around [70] for the paths themselves.

The features do evaluate, well, sort of (MIN and MAX are numeric only until we decide what to do about mixed values, for example).  The implementing code is only in ARQ SVN.  Testing is minimal.

Rules [49]/[50]/[51] are missing.  They cover some features we are not considering, so I have edited them out manually until I can change the production workflow to know about SPARQL 1.1 (they would get called from [46]; see arq.html if you really want to know).  If anything we are not considering remains, please just ignore it; it's an editing error.

It includes SPARQL/Update (it's a single grammar; the pattern parts are shared).  Not sure whether that's a good idea or not.

It's a bit scrappy at the moment: the rules could be clearer, as could how they are placed and written in some cases.

----

One minor issue (I say "minor" because it's a point design decision and it's good that all the other parts seem to work out).

In the design pages we have some different proposals for the syntax of select expressions.  Without mandating a comma (which would break backward compatibility), one case to remember is ?x-?y: is that (?x-?y) or "?x" followed by "-?y"?  This is why expressions get bracketed a lot in SPARQL.
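To make the ?x-?y case concrete, here is a minimal Python sketch that enumerates every way a comma-free list of expressions can be read. The toy grammar (variables, unary minus, binary minus) and the function names are assumptions for illustration only; they are not taken from the grammar linked above.

```python
import re

def tokenize(text):
    # Only variables like ?x and the minus sign are recognised in this sketch.
    return re.findall(r"\?\w+|-", text)

def parse_unary(toks, i):
    """Yield (tree, next_index) for: Unary := '-' Unary | Var."""
    if i < len(toks) and toks[i] == "-":
        for tree, j in parse_unary(toks, i + 1):
            yield ("neg", tree), j
    elif i < len(toks):
        yield toks[i], i + 1

def extend(toks, left, j):
    """Yield `left` as it stands, then every longer subtraction starting with it."""
    yield left, j  # the expression may end here, before any following '-'
    if j < len(toks) and toks[j] == "-":
        for right, k in parse_unary(toks, j + 1):
            yield from extend(toks, ("sub", left, right), k)

def parse_expr(toks, i):
    """Yield every prefix parse of: Expr := Unary ('-' Unary)*."""
    for left, j in parse_unary(toks, i):
        yield from extend(toks, left, j)

def select_lists(toks, i=0):
    """Yield every split of the tokens into one or more expressions."""
    for tree, j in parse_expr(toks, i):
        if j == len(toks):
            yield [tree]
        else:
            for rest in select_lists(toks, j):
                yield [tree] + rest

readings = list(select_lists(tokenize("?x-?y")))
for reading in readings:
    print(reading)
```

Both readings come out: one list holding a single subtraction, and one list holding ?x followed by a negated ?y.  Brackets (not modelled in this sketch) or a separator are what remove the second reading.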

>> Design:Aggregate

     ( (Var | AggregateFunc '(' Var ') AS' Var)+ | '*' )

AS is required

SELECT ?a ?b count(?foo) AS ?z

But I don't think the requirement is necessary.

>> Design:Project_Expression

     ( ( Var | PrimaryExpression "AS" Var )+ | "*" )

AS is required; single variables cannot be renamed without additional ().

Example:

SELECT ?a ?b (?x+?y) AS ?z ?var (?a) AS ?z1 fn:concat(?firstName, " ", ?lastName) AS ?name

But not SELECT ?a AS ?z

>> Design:SubSelect

    ( Var | AggExpression | BuiltInCall | FunctionCall | ( '(' Expression ( 'AS' Var )? ')' ) )+

Example:

SELECT ?a count(*) (?x+?y) my:function(?a) ?var
SELECT (?x+?y AS ?z) (?q AS ?r) (fn:concat(?firstName, " ", ?lastName) AS ?name)

(the system invents a name for an expression if needed when there is no AS)
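As a rough illustration of the SubSelect style, the Python sketch below splits a projection into items and pairs each expression with a variable, inventing a name when there is no AS.  It handles only bare variables and bracketed items, ignores escaped quotes, and the generated "?_exprN" names are hypothetical; it is not how ARQ or Anzo actually do it.

```python
import itertools
import re

# Matches "<expression> AS ?var" inside a bracketed item.
AS_TAIL = re.compile(r"(.*\S)\s+AS\s+(\?\w+)", re.S)

def split_items(clause):
    """Split on whitespace that is outside brackets and outside string literals."""
    items, current, depth, in_string = [], "", 0, False
    for ch in clause:
        if ch == '"':
            in_string = not in_string
        elif not in_string and ch == "(":
            depth += 1
        elif not in_string and ch == ")":
            depth -= 1
        if ch.isspace() and depth == 0 and not in_string:
            if current:
                items.append(current)
            current = ""
        else:
            current += ch
    if current:
        items.append(current)
    return items

def project(clause):
    """Return (expression, variable) pairs for the projection."""
    fresh = itertools.count(1)
    pairs = []
    for item in split_items(clause):
        if item.startswith("(") and item.endswith(")"):
            inner = item[1:-1].strip()
            match = AS_TAIL.fullmatch(inner)
            if match:
                pairs.append((match.group(1), match.group(2)))
            else:
                # No AS: invent a name (hypothetical naming scheme).
                pairs.append((inner, "?_expr%d" % next(fresh)))
        else:
            # Unbracketed items are passed through under their own text.
            pairs.append((item, item))
    return pairs

named = project('(?x+?y AS ?z) (?q AS ?r) (fn:concat(?firstName, " ", ?lastName) AS ?name)')
unnamed = project("?a (?x+?y) ?var")
```

Because the AS sits inside the brackets, each item can be split off before looking at its contents, which is the grouping property this design offers.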

----

The grammar implements the last one from the SubSelect design (Anzo, ARQ).  The second case is Virtuoso (and Algae2?) syntax.  Can't find documentation for other SPARQL systems online just at the moment.

The main issue I see is that none of them allow "*" together with a computed expression, which I would have thought would be useful (ran out of time on this one).

  SELECT * (fn:concat(?firstName, " ", ?lastName) AS ?name) 


Other than that, it's mainly a value judgement about appearance that decides, especially between adjacent separate terms in the SELECT clause (a rename as ?x, followed by a plain projection of ?y).

The second one allows 

...(expr) AS ?x ?y ....

The third one groups with 

...(expr AS ?x) ?y ....

but could be considered artificial.
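For comparison, here is a small Python sketch of what reading the second style, "(expr) AS ?x ?y", involves: after each item the reader has to peek at the next one to see whether an AS follows.  The function name and the pre-split item list are assumptions made for illustration; the rule that a bare variable cannot take AS follows the Design:Project_Expression description above.

```python
def group_postfix_as(items):
    """Pair each item with its alias (or None) in the '(expr) AS ?x ?y' style."""
    pairs, i = [], 0
    while i < len(items):
        item = items[i]
        # One item of lookahead decides whether this item is being renamed.
        if i + 1 < len(items) and items[i + 1] == "AS":
            if item.startswith("?"):
                raise ValueError("bare variable needs () before AS: " + item)
            if i + 2 >= len(items):
                raise ValueError("AS without a variable")
            pairs.append((item, items[i + 2]))
            i += 3
        else:
            pairs.append((item, None))
            i += 1
    return pairs

grouped = group_postfix_as(["(?x+?y)", "AS", "?x", "?y"])
print(grouped)
```

Here ?x is consumed as the alias and ?y is left as a plain projection, so the same two adjacent variables play different roles depending on the AS before them.  With the third style, the brackets carry that information instead.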

 Andy

Received on Friday, 28 August 2009 17:23:58 UTC