- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Tue, 20 Feb 2007 18:17:08 +0000
- To: SOURIPRIYA.DAS@ORACLE.COM
- CC: public-rdf-dawg@w3.org
Seaborne, Andy wrote:
> (text version - not everyone can easily work with doc files)
>
> SOURIPRIYA.DAS@ORACLE.COM wrote:
>> I have attached a few review comments on rq25. Thanks. -- Souri.

Thanks for the comments.

>
> I looked at the syntax of GroupGraphPattern (<Ggp>, for short) and its use in
> the examples in the Revision 1.13 (Date 2007/02/16, 18:06:24 hrs) version of
> SPARQL Query Language for RDF document. Here are my comments:
>
> Summary:
> Based on the current grammar, <Ggp> probably can be expressed as a regular
> expression:
>
> <Ggp> := `{` <Bgp>? (<NonTriplesgp> .? <Bgp>?)* }
>
> or equivalently (using some self-explanatory notations and with grammar rule
> [32] for GroupOrUniongp broken up for using Ggp and UNIONgp)

The SPARQL grammar is LL(1) so that it can be used by the widest possible set
of compiler tools. LL(1) allows simple tools, more sophisticated LL and hybrid
tools, as well as LALR(1) tools to use the grammar.

A consequence is that the grammar sometimes needs an extra, purely technical
intermediate parser state; GroupOrUnionGraphPattern is one of these. Written
out directly, it would need a lookahead of 2 for LL.

GroupOrUnionGraphPattern means a parser does not have to look past the first
{}-group for a UNION to decide whether it has met a UNION or just a Group.
Instead it enters an intermediate rule that covers both cases. This is a
standard way to reduce to LL(1).

>
> <Ggp> := `{` <Bgp>?
>     (
>       (<constraint> | <OPTIONALgp> | <UNIONgp> | <GRAPHgp> | <Ggp>)
>       .?
>       <Bgp>?
>     )*
>     }

Changes in progress:

1/ The BasicGraphPattern rule name will change to TriplesBlock because of the
WG decision that filters don't break up a BGP anymore.

2/ Filter is not in GraphPatternNotTriples because it does not break up a BGP
anymore.

Removing the recursion would be good.
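The LL(1) point about GroupOrUnionGraphPattern can be sketched as a small recursive-descent parser. This is a toy under stated assumptions, not the SPARQL grammar: it assumes the intermediate rule has the shape `GroupOrUnion ::= Group ( 'UNION' Group )*`, treats every non-brace word inside a group as an opaque item, and the names `tokenize`, `Parser` and `parse` are invented for the illustration. The thing to notice is that `peek()` only ever inspects the single next token.

```python
def tokenize(text):
    """Split a toy query fragment into tokens: '{', '}', 'UNION', or any other word."""
    return text.replace("{", " { ").replace("}", " } ").split()


class Parser:
    """Toy LL(1) parser: every decision is made from one token of lookahead."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # The only lookahead available: the next unread token (or None at end).
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {self.peek()!r}")
        self.pos += 1

    def group(self):
        # Toy Group ::= '{' ( word | GroupOrUnion )* '}'
        self.expect("{")
        items = []
        while self.peek() not in ("}", None):
            if self.peek() == "{":
                items.append(self.group_or_union())
            else:
                items.append(self.peek())
                self.pos += 1
        self.expect("}")
        return ("group", items)

    def group_or_union(self):
        # Assumed shape: GroupOrUnion ::= Group ( 'UNION' Group )*
        # The parser commits to this rule on seeing '{' and only decides
        # "plain group or union?" after the first group has been consumed.
        alternatives = [self.group()]
        while self.peek() == "UNION":
            self.pos += 1
            alternatives.append(self.group())
        if len(alternatives) == 1:
            return alternatives[0]
        return ("union", alternatives)


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.group_or_union()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return tree
```

Without the shared rule, a parser that sees `{` would have to scan to the matching `}` before knowing whether a UNION follows; with it, the choice is deferred until the UNION keyword is the next token.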
I've taken your rewrite and put it into the development version of the
grammar, along with the other changes in progress, and get:

GroupGraphPattern ::=
    '{' TriplesBlock? ( ( GraphPatternNotTriples | Filter ) '.'? TriplesBlock? )* '}'

I have also removed recursion for ObjectList and PropertyList. I haven't
worked out a way to do it for TriplesBlock without allowing adjacent DOTs
between triples of different subjects (ditto ConstructTriples). Suggestions
welcome!

The development grammar passes the current set of 192 syntax tests.

> Separator vs. Terminator: Based on the current grammar, rules for use (or
> non-use) of . are the following:
>
> o [R1] Mandatory use as a separator between triples in a <Bgp>
> o [R2] Must not be used as a terminator for the last triple in a <Bgp> (or as
> a separator between the last triple of a <Bgp> and an immediately following
> <NonTriplesgp>).

This is legal. We follow Turtle/N3 - the final dot of a BGP is optional.

TriplesBlock ::= TriplesSameSubject ( '.' TriplesBlock? )?

Tests show this:

http://www.w3.org/2001/sw/DataAccess/tests/data-r2/syntax-sparql1/syntax-basic-05.rq
http://www.w3.org/2001/sw/DataAccess/tests/data-r2/syntax-sparql1/syntax-basic-06.rq
http://www.w3.org/2001/sw/DataAccess/tests/data-r2/syntax-sparql1/syntax-struct-03.rq
http://www.w3.org/2001/sw/DataAccess/tests/data-r2/syntax-sparql1/syntax-struct-05.rq
http://www.w3.org/2001/sw/DataAccess/tests/data-r2/syntax-sparql1/syntax-struct-06.rq

> o [R3] Optional use as a terminator for <NonTriplesgp>.
> One way to fix the instances of illegal use (enumerated below) probably
> would be to allow use of . optionally as a terminator for (the last triple
> of) a <Bgp>.

As noted, these uses aren't illegal.

> o I am okay with that workaround. However, for simplicity, my preferred
> solution would be to require use of . as a (mandatory) terminator for each
> triple (thus requiring a . at the end of each <Bgp> as well unlike the R1
> rule above used currently) and maybe also for each <NonTriplesgp> (thus
> requiring, unlike the optional nature of rule R3 above, a . at the end of
> each <constraint>, <OPTIONALgp>, etc.).
>
> Instances of illegal use of . in GroupGraphPatterns used in examples [NOTE:
> All of these instances are violations of rule R2 above]:
> The example in Sec 2.1 (on page 7) uses . as a terminator.

That is in the Turtle data. The query does not have a trailing dot on the BGP
(but it would be legal).

> The example in Sec 3.2 (on page 11) uses . as a separator between a triple
> and a non-triple (FILTER, in this case).
> The example in Sec 3.2 (on page 11) uses . as a terminator (for the last
> triple).
> Both the examples in Sec 5.2 (on page 18) use . as terminator (for the
> respective last triples).
> The three examples in Sec 5.4 (on page 19) use . as terminator and/or as
> separator between a triple and an immediately following non-triple.
> Same problem with the examples in Sec 5.5 (on page 19).
> Same problem with the query in Sec 6.1 (on page 21).
> Same problem with the example in Sec 6.2 (on page 21). The problem exists
> inside the <OPTIONALgp> as well.
> Same problem with the example in Sec 6.3 (on page 22).
> Same problem with the example in Sec 6.4 (on page 22).
> Same problem with the example in Sec 8.2.3 (on page 28).
> The example in Sec 8.3.3 (on page 30) uses . as terminator for the last
> triple in the first <GRAPHgp>.
> The example in Sec 8.3.4 (on page 31) uses . as separator between <Bgp>
> and <GRAPHgp>.
> The example in Sec 10.1 (on page 35) uses . as separator between <Bgp> and
> <OPTIONALgp>.
> The example in Sec 10.2.3 (on page 38) uses . as terminator for the last
> triple in the <Bgp>.
> The example in Sec 11 (on page 41, before Sec 11.1) uses . as a separator
> between <Bgp> and <constraint>.
> Same problem with the examples in Sec 11.4 and Sec 11.6.
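
The dot rules discussed above can be checked mechanically against the two grammar rules quoted in this message (the rewritten GroupGraphPattern and the recursive TriplesBlock). The sketch below is only an illustration: it works on an abstract symbol stream rather than real SPARQL, and the symbols `T` (one TriplesSameSubject), `N` (one GraphPatternNotTriples or Filter) and the function names are assumptions made for the example. The recursive TriplesBlock rule is unrolled into the equivalent loop `T ('.' T)* '.'?`.

```python
def accepts_group(symbols):
    """Return True if the symbols between '{' and '}' fit the GroupGraphPattern body.

    symbols is a list of "T", "N" and "." (hypothetical abstract tokens).
    """
    n = len(symbols)

    def triples_block(pos):
        # TriplesBlock ::= TriplesSameSubject ( '.' TriplesBlock? )?
        # Returns the position after the block, or None if no block starts here.
        if pos >= n or symbols[pos] != "T":
            return None
        pos += 1
        while pos < n and symbols[pos] == ".":
            pos += 1
            if pos < n and symbols[pos] == "T":
                pos += 1
            else:
                break  # the dot just consumed was the optional final dot
        return pos

    # '{' TriplesBlock? ...
    end = triples_block(0)
    pos = end if end is not None else 0

    # ... ( ( GraphPatternNotTriples | Filter ) '.'? TriplesBlock? )* '}'
    while pos < n:
        if symbols[pos] != "N":
            return False
        pos += 1
        if pos < n and symbols[pos] == ".":
            pos += 1
        end = triples_block(pos)
        if end is not None:
            pos = end
    return True


def accepts(text):
    """Convenience wrapper: accepts("T . N") instead of a symbol list."""
    return accepts_group(text.split())
```

Under these assumptions, `accepts("T . T .")` and `accepts("T . N")` hold, matching the reply that a final dot on a BGP is legal, while `accepts("T T")` and `accepts("T . . T")` do not, since the recursive TriplesBlock rule requires a dot between triples and admits no adjacent dots.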
Thanks for comments and suggestion for removing the recursion, Andy
Received on Tuesday, 20 February 2007 18:17:25 UTC