
Re: Some comments re: the current Turtle working draft

From: Alex Hall <alexhall@revelytix.com>
Date: Mon, 11 Jul 2011 10:39:02 -0400
Message-ID: <CAFq2biw-_1PxeDcJznMpgmme5MxHEgN7JLkG9wmTiVj-i5k2mw@mail.gmail.com>
To: Mischa Tuffield <mischa.tuffield@garlik.com>
Cc: RDF WG <public-rdf-wg@w3.org>
On Sat, Jul 9, 2011 at 6:15 AM, Mischa Tuffield
<mischa.tuffield@garlik.com> wrote:

> <snip/>
>
> On 9 Jul 2011, at 01:02, Alex Hall <alexhall@revelytix.com> wrote:
>
> On Fri, Jul 8, 2011 at 12:29 PM, Mischa Tuffield
> <mischa.tuffield@garlik.com> wrote:
>
>> <snip/>
>> 5. In Section 4.4 - Grammar: there is a distinct lack of whitespace
>> handling here; I am guessing this is because the current grammar is but a
>> first pass. There is an email thread I started on this list which includes
>> feedback from Stefano D'Angelo (a parser implementer); I think we should
>> make sure we address the issues brought forward there [1].
>>
>>
> There is a related note from Andy at [1].  Basically, whitespace and
> comments are included in the PASSED TOKENS rule, which indicates that
> whitespace and comments are allowed as tokens (a.k.a. terminals) anywhere in
> the grammar but ignored.  This reflects the fact that many tools (JavaCC,
> ANTLR, etc.) can skip whitespace tokens or emit them on a special hidden
> channel.
>
> Note that section 4.1 does talk some about whitespace.  Manually inserting
> whitespace tokens everywhere they could possibly appear in the grammar would
> be too difficult and would obscure the meaningful parts of the grammar.  So
> we just say that it's allowed everywhere (outside of terminals) and only
> required to disambiguate two terminals that would otherwise be interpreted
> as one.
>
> Note also that the SPARQL grammar [2] handles whitespace in a similar
> fashion.
>
>
> I have just gone through the SPARQL 1.1 grammar and agree that the handling
> of whitespace is best left out so as not to obscure the meaningful parts.
> FWIW, Section 4.1 is slightly confusing from my point of view; perhaps the
> following statement in [a] should be expanded upon:
>
> "White space is significant in tokens IRI_REF
> <http://dvcs.w3.org/hg/rdf/raw-file/Turtle-FPWD/rdf-turtle/index.html#prod-turtle2-IRI_REF>
> and string
> <http://dvcs.w3.org/hg/rdf/raw-file/Turtle-FPWD/rdf-turtle/index.html#prod-turtle2-String>."
>

I think all this is saying is that whitespace appearing within an IRI or
string literal is not ignored as it is in other parts of the grammar, i.e.
"foo bar" != "foobar".  This does bring up the question of whether the
mention of IRI_REF should be dropped here, since whitespace is no longer
allowed in IRIs.
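
To make the distinction concrete, here is a minimal Python sketch of a
tokenizer that drops whitespace and comments between tokens but leaves
whitespace inside a string literal untouched. The token set, the names
TOKEN_RE and tokenize, and the simplified IRI pattern are all assumptions
made for illustration; this is not the Turtle grammar or any tool's
actual implementation.

```python
import re

# Hypothetical, heavily simplified token set (not the Turtle grammar).
TOKEN_RE = re.compile(r"""
    (?P<SKIP>\s+|\#[^\n]*)             # whitespace and comments: matched, then dropped
  | (?P<STRING>"(?:[^"\\\n]|\\.)*")   # string literal: inner whitespace is kept
  | (?P<IRI><[^<>\s]*>)                # IRI reference (simplified, no whitespace)
  | (?P<PUNCT>[.;,])                   # statement punctuation
""", re.VERBOSE)


def tokenize(text):
    """Return (kind, lexeme) pairs, omitting whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}: {text[pos]!r}")
        if m.lastgroup != "SKIP":
            # Only meaningful tokens reach the parser.
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

With this sketch, tokenize('<a> <b> "foo bar" .') and
tokenize('<a>\n<b>\t"foo bar".  # comment') yield the same token list,
while "foo bar" and "foobar" remain different STRING tokens.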

Also, the IRI_REF production is not displaying correctly in my browser.  I
see:

<IRI_REF>   ::=   "<" (( [^<>\"{}|^`\\] - [#0000- ] ) | UCHAR )* ">"

Note that the class of excluded characters has a lower bound (#0000) but no
upper bound.  Comparing to the SPARQL grammar, it looks like that part
should read [#0000-#0020].
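
For what it's worth, here is how I would read the production with that
range filled in, sketched as a Python regular expression. It assumes the
excluded range really is #0000-#0020 and that UCHAR means a \u escape
with four hex digits or a \U escape with eight; the names IRI_REF and
is_iri_ref are mine, not from the draft.

```python
import re

# Assumed reading: "<" (( [^<>\"{}|^`\\] - [#0000-#0020] ) | UCHAR )* ">"
IRI_REF = re.compile(
    r'<(?:[^<>"{}|^`\\\x00-\x20]'                 # literal chars, minus U+0000..U+0020
    r'|\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8})*>'    # UCHAR escapes (assumed form)
)


def is_iri_ref(token):
    """True if the whole token matches the assumed IRI_REF production."""
    return IRI_REF.fullmatch(token) is not None
```

Under this reading, is_iri_ref("<http://example.org/a>") holds, while a
token containing a space or tab between the angle brackets is rejected,
which matches the point that whitespace is no longer allowed in IRIs.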

-Alex



>
> -Alex
>
> [1] http://lists.w3.org/Archives/Public/public-rdf-wg/2011Mar/0297.html
> [2] http://www.w3.org/TR/sparql11-query/#whitespace
>
>
> The link [2] above doesn't resolve in my browser, and I can't find any
> section entitled whitespace in the document; nevertheless I do prefer
> SPARQL's style over the older "ws"-heavy Turtle submission.
>
> Regards,
>
> Mischa
>
> [a]
> http://dvcs.w3.org/hg/rdf/raw-file/Turtle-FPWD/rdf-turtle/index.html#sec-grammar-ws
>
>
> - sent from a tablet thing
>
Received on Monday, 11 July 2011 14:39:30 GMT
