Syntax modifications

We do not have to pin it down yet, but concrete syntax matters for
expressing testcases.  It is helpful to have a syntax that can express the
range of testcases and that then evolves as the draft evolves.

In the initial strawman, there was no graph nesting: graph patterns resided
at clause level only, so there was only a restricted form of pattern
combination.  The requirements on the language (particularly
3.13/Disjunction but also 3.6/Optional Match, if nested optionals are
allowed) suggest that two or more graph patterns can be combined.  Enforcing
only top-level disjunction is a burden on the application writer.  Similarly,
nested optional matches would need syntax for graph patterns and ways to
combine such patterns.


Proposal (this is a loose description for comment, not a formal definition):

1/ Graph "patterns" are grouped by {}

2/ A pattern is a list (set) of elements, interpreted as a
conjunction of elements.

3/ Elements are:
   + patterns
   + triples, no parentheses, with trailing dot to terminate/separate
     (Not allowing N3 style ; and , for the moment).
   + Constraints, relaxing the separation between
     triple patterns and constraints but retaining
     the familiar mathematical syntax.
     Leaves open whether 
   + optional sub patterns

4/ Graph patterns can be combined with OR (and AND)

Other:

5/ Prefixes don't have to be defined last, but can occur before use.
    The only reason for not making them first is that it is nice to have
    SELECT/CONSTRUCT/DESCRIBE/ASK first.

6/ Constraints: mix with triple patterns, with or without outer ().
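
To make points 1/ to 6/ concrete, here is a minimal sketch of a hand-written
recursive-descent parser in Python.  It is not the javacc grammar, only an
illustration under assumptions: the names Parser and parse_pattern, the token
regular expression, the tuple-shaped tree, and the choice to treat every dot
as optional are all hypothetical, and only the pattern part of a query
(after WHERE) is handled.

```python
import re

# Hypothetical token set: IRIs, braces/parens/dot, comparison operators, terms.
TOKEN = re.compile(r"<[^<>\s]+>|[{}().]|<=|>=|!=|[<>=]|\??[\w:]+")
COMPARE = {"<", ">", "<=", ">=", "=", "!="}
RESERVED = set("{}().") | COMPARE | {"OR", "OPTIONAL"}


class Parser:
    def __init__(self, text):
        self.toks = TOKEN.findall(text)
        self.i = 0

    def peek(self, k=0):
        j = self.i + k
        return self.toks[j] if j < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected or 'a token'!r}, got {tok!r}")
        self.i += 1
        return tok

    def term(self):
        tok = self.take()
        if tok in RESERVED:
            raise SyntaxError(f"expected a term, got {tok!r}")
        return tok

    def pattern(self):
        # 1/ and 2/: '{' elements '}', read as a conjunction of the elements.
        # Assumption: dots between and after elements are all optional here.
        self.take("{")
        elements = []
        while self.peek() != "}":
            elements.append(self.element())
            if self.peek() == ".":
                self.take(".")
        self.take("}")
        return ("group", elements)

    def element(self):
        tok = self.peek()
        if tok == "{":
            # 3/ and 4/: a nested pattern, optionally combined with OR.
            alternatives = [self.pattern()]
            while self.peek() == "OR":
                self.take("OR")
                alternatives.append(self.pattern())
            if len(alternatives) == 1:
                return alternatives[0]
            return ("or", alternatives)
        if tok == "OPTIONAL":
            self.take("OPTIONAL")
            return ("optional", self.pattern())
        if tok == "(":
            # 6/: a constraint written with outer parentheses.
            self.take("(")
            constraint = self.constraint()
            self.take(")")
            return constraint
        if self.peek(1) in COMPARE:
            # 6/: a constraint without outer parentheses (peeks two tokens).
            return self.constraint()
        return ("triple", self.term(), self.term(), self.term())

    def constraint(self):
        left = self.term()
        op = self.take()
        if op not in COMPARE:
            raise SyntaxError(f"expected a comparison operator, got {op!r}")
        return ("constraint", op, left, self.term())


def parse_pattern(text):
    parser = Parser(text)
    tree = parser.pattern()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return tree
```

For example, parse_pattern("{ ?x :p ?z . ?z < 42 }") yields a "group" node
holding one "triple" and one "constraint".  The sketch is deliberately
permissive: it accepts an empty {} and does not require a dot between
elements.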

I have mocked it up in javacc and have parsed the examples below.  The
amount of change in the parser is fairly small because the tokenizer didn't
change - it was just a matter of writing grammar rules.  The impact on the
prototype execution engine will be wider, mainly due to the "OR".  (What I'm
not clear on is the whole (non-syntax) issue of more complex queries and
mapping to SQL technologies.)

Examples:

SELECT * WHERE { ?x ?y ?z }
SELECT * WHERE { ?x ?y ?z . }
   Trailing . is optional
   WHERE is actually unnecessary for a grammar.

SELECT ?name ?mbox
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
WHERE  { ?x  foaf:name  ?name .
         ?x  foaf:mbox  ?mbox }

SELECT ?name ?mbox
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
WHERE  { ?x foaf:name  ?name .
         OPTIONAL { ?x  foaf:mbox  ?mbox } }

Could use [] for optional:
SELECT ?name ?mbox ?shoe
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
WHERE  { ?x foaf:name  ?name .
         [ ?x  foaf:mbox  ?mbox .
           ?x  foaf:shoeSize  ?shoe ]
       }

Depending on what the outermost grammar production is (element vs pattern),
we require outer {} or not for things like OR:

// Outer rule is "element"
SELECT * WHERE { ?x :p ?z } OR { ?x :q ?z }

// But this is a bit confusing
SELECT * WHERE { ?x :p ?z . { ?a :a ?x } OR { ?a :a ?x } }

// Outer rule is "pattern": {} at all times
SELECT * WHERE { { ?x :p ?z } OR { ?x :q ?z } }

SELECT * WHERE { ?x :p ?z .
                 { { ?a :a ?x } OR { ?a :a ?x } }
               }

SELECT * WHERE { ?x :p ?z . ?z < 42 }
SELECT * WHERE { ?x :p ?z . ?z < 42 . ?a :q ?z }

With javacc and lookahead set globally to 1 (but locally to 2 for optional
things like trailing dots and allowing commas in various places), I found
this works.
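
A small Python sketch of why an optional trailing dot needs a second token
of lookahead.  It assumes a stricter rule than strictly necessary,
'{' triple ('.' triple)* '.'? '}', with at least one triple and
pre-split tokens; the function name parse_group is hypothetical and this is
not the javacc grammar itself.

```python
PUNCT = {"{", "}", "."}


def parse_group(tokens):
    """Parse '{' triple ('.' triple)* '.'? '}' from a list of tokens."""
    pos = 0

    def peek(k=0):
        return tokens[pos + k] if pos + k < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} at token {pos}, got {peek()!r}")
        pos += 1

    def triple():
        nonlocal pos
        terms = tokens[pos:pos + 3]
        if len(terms) < 3 or any(t in PUNCT for t in terms):
            raise SyntaxError(f"expected a triple at token {pos}")
        pos += 3
        return tuple(terms)

    expect("{")
    triples = [triple()]
    # With one token of lookahead we only see '.'.  The second token decides
    # whether the dot separates two triples or is the optional trailing dot.
    while peek() == "." and peek(1) != "}":
        expect(".")
        triples.append(triple())
    if peek() == ".":
        expect(".")
    expect("}")
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r} after '}}'")
    return triples
```

Both "{ ?x ?y ?z }".split() and "{ ?x ?y ?z . }".split() parse to the same
single triple; the only place the parser looks two tokens ahead is the loop
condition.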

We could mandate outer () on expressions:

SELECT * WHERE { ?x :p ?z . (?z < 42) }

Personally, I prefer leaving layout to the application writer, i.e. don't
mandate outer ().

It is possible to write strange queries like "OR {}", "WHERE OPTIONAL {}"
but I suggest we keep a fully general grammar for now and leave aside
whether to ban any non-queries in the grammar until we have a full set of
semantics for the query language.  We can then decide on grammar
enhancements to make things clearer.

     Andy

Received on Monday, 9 August 2004 15:31:58 UTC