- From: Alex Hall <alexhall@revelytix.com>
- Date: Mon, 11 Jul 2011 10:39:02 -0400
- To: Mischa Tuffield <mischa.tuffield@garlik.com>
- Cc: RDF WG <public-rdf-wg@w3.org>
- Message-ID: <CAFq2biw-_1PxeDcJznMpgmme5MxHEgN7JLkG9wmTiVj-i5k2mw@mail.gmail.com>
On Sat, Jul 9, 2011 at 6:15 AM, Mischa Tuffield <mischa.tuffield@garlik.com> wrote:

> <snip/>
>
> On 9 Jul 2011, at 01:02, Alex Hall <alexhall@revelytix.com> wrote:
>
> On Fri, Jul 8, 2011 at 12:29 PM, Mischa Tuffield
> <mischa.tuffield@garlik.com> wrote:
>
>> <snip/>
>> 5. In Section 4.4 - Grammar: there is a distinct lack of whitespacing
>> here; I am guessing this is because the current grammar is but a first
>> pass. There is an email thread I started on this list which includes
>> feedback from a Stefano D'Angelo (parser implementer). I think we should
>> make sure we address the issues brought forward there [1].
>>
>
> There is a related note from Andy at [1]. Basically, whitespace and
> comments are included in the PASSED TOKENS rule, which indicates that
> whitespace and comments are allowed as tokens (a.k.a. terminals) anywhere
> in the grammar but ignored. This reflects the fact that many tools
> (javacc, Antlr, etc.) can skip whitespace tokens or emit them on a special
> hidden channel.
>
> Note that section 4.1 does talk some about whitespace. Manually inserting
> whitespace tokens everywhere they could possibly appear in the grammar
> would be too difficult and would obscure the meaningful parts of the
> grammar. So we just say that it's allowed everywhere (outside of
> terminals) and only required to disambiguate two terminals that would
> otherwise be interpreted as one.
>
> Note also that the SPARQL grammar [2] handles whitespace in a similar
> fashion.
>
>
> I have just gone through the SPARQL 1.1 grammar and agree that the
> handling of whitespace is best left out so as not to obscure the
> meaningful parts. FWIW, Section 4.1 is slightly confusing from my point of
> view; perhaps the following statement in [a] should be expanded upon:
>
> "White space is significant in tokens IRI_REF
> <http://dvcs.w3.org/hg/rdf/raw-file/Turtle-FPWD/rdf-turtle/index.html#prod-turtle2-IRI_REF>
> and string
> <http://dvcs.w3.org/hg/rdf/raw-file/Turtle-FPWD/rdf-turtle/index.html#prod-turtle2-String>."
>

I think all this is saying is that whitespace appearing within an IRI or
string literal is not ignored as it is in other parts of the grammar, i.e.
"foo bar" != "foobar". This does bring up the question of whether the
mention of IRI_REF should be dropped here, since whitespace is no longer
allowed in IRIs.

Also, the IRI_REF production is not displaying correctly in my browser. I
see:

    <IRI_REF> ::= "<" (( [^<>\"{}|^`\\] - [#0000- ] ) | UCHAR )* ">"

Note that the class of excluded characters has a lower bound (#0000) but no
upper bound. Comparing to the SPARQL grammar, it looks like that part
should read [#0000-#0020].

-Alex

>
> -Alex
>
> [1] http://lists.w3.org/Archives/Public/public-rdf-wg/2011Mar/0297.html
> [2] http://www.w3.org/TR/sparql11-query/#whitespace
>
>
> The link [2] above doesn't resolve in my browser, and I can't find any
> section entitled whitespace in the document. Nevertheless, I do prefer
> SPARQL's style over the older "ws"-heavy Turtle submission.
>
> Regards,
>
> Mischa
>
> [a] http://dvcs.w3.org/hg/rdf/raw-file/Turtle-FPWD/rdf-turtle/index.html#sec-grammar-ws
>
>
> - sent from a tablet thing
>
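
The whitespace handling discussed in this message can be sketched in Python. This is only an illustration, not the Turtle grammar itself: the `tokenize` helper, the token names, and the simplified `STRING` and `DOT` patterns are assumptions made for the example, and the `IRI_REF` pattern assumes the excluded range is meant to read [#0000-#0020], as suggested above by comparison with the SPARQL grammar.

```python
import re

# UCHAR escapes: \uXXXX or \UXXXXXXXX.
UCHAR = r"\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8}"

# Assumption: the excluded character range is [#0000-#0020], so control
# characters and the space character are not allowed inside an IRI.
IRI_REF = re.compile(r'<(?:[^<>"{}|^`\\\x00-\x20]|' + UCHAR + r")*>")

# Hypothetical, simplified string pattern (short double-quoted form only).
STRING = re.compile(r'"(?:[^"\\\n\r]|\\.)*"')
DOT = re.compile(r"\.")

# Whitespace is recognized as a token but skipped between terminals.
WS = re.compile(r"[ \t\r\n]+")

TOKEN_PATTERNS = (("IRI_REF", IRI_REF), ("STRING", STRING), ("DOT", DOT))


def tokenize(text):
    """Return (name, lexeme) pairs, dropping whitespace between tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        ws = WS.match(text, pos)
        if ws:
            pos = ws.end()  # skipped, like a "passed" token
            continue
        for name, pattern in TOKEN_PATTERNS:
            m = pattern.match(text, pos)
            if m:
                tokens.append((name, m.group()))
                pos = m.end()
                break
        else:
            raise ValueError(f"unexpected input at offset {pos}")
    return tokens


# Whitespace between terminals does not change the token stream.
spaced = tokenize('<http://example.org/s>   "foo bar"  .')
tight = tokenize('<http://example.org/s>"foo bar".')
print(spaced == tight)

# Whitespace inside a string literal is kept, so these lexemes differ.
print(tokenize('"foo bar"') != tokenize('"foobar"'))
```

Under this reading, whitespace inside a string literal stays part of the lexeme, while an IRI containing a space simply fails to match `IRI_REF`.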
Received on Monday, 11 July 2011 14:39:30 UTC