Re: whitespace in Turtle from Andy Seaborne on 2012-05-15 (public-rdf-wg@w3.org from May 2012)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Tue, 15 May 2012 16:41:54 +0100
To: Gavin Carothers <gavin@carothers.name>
CC: public-rdf-wg@w3.org
Message-ID: <4FB27942.2070403@epimorphics.com>

>> Pragmatically, WS after @prefix and @base.  Then can have a token type
>> that is "@alpha-alphanumericsanddash".
>
> What you don't like:
>
> [19]  LANGTAG  ::= (BASE | PREFIX | '@' ([a-zA-Z])+ ('-' ([a-zA-Z0-9])+)
>
> ? ;)

(to the casual reader: BASE is '@base' and PREFIX is '@prefix')

Which is ambiguous - as it says:

LANGTAG ::= ('@base' | '@prefix' | '@' ([a-zA-Z])+ ('-' ([a-zA-Z0-9])+)

so the string "@base" matches two ways.

But even if sorted out ... it means a tokenizer may well generate the 
token LANGTAG ... and then:

[5]  base  ::=  BASE IRIREF

does not match as the token is LANGTAG, not BASE.  Oops.
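
To make the failure concrete, here is a minimal sketch of a regex-driven
tokenizer. It is an illustration under stated assumptions, not how any
particular Turtle parser is written: the pattern table, tokenize() and
matches_base_rule() are hypothetical names, the LANGTAG pattern is a
simplified form of rule [19] without the BASE/PREFIX alternatives, and
IRIREF is cut down to "anything between angle brackets".

```python
import re

# Hypothetical, simplified token table (an assumption for illustration).
# There is no separate BASE token here, so '@base' can only come out as LANGTAG.
TOKEN_PATTERNS = [
    ("LANGTAG", re.compile(r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*")),
    ("IRIREF", re.compile(r"<[^<>\s]*>")),
]


def tokenize(text):
    """Return (type, lexeme) pairs, trying each pattern in order at each position."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        for name, pattern in TOKEN_PATTERNS:
            match = pattern.match(text, pos)
            if match:
                tokens.append((name, match.group()))
                pos = match.end()
                break
        else:
            raise ValueError(f"no token matches at offset {pos}")
    return tokens


def matches_base_rule(tokens):
    """Check rule [5] literally: a BASE token followed by an IRIREF token."""
    return [token_type for token_type, _ in tokens] == ["BASE", "IRIREF"]


tokens = tokenize("@base <http://example.org/>")
print(tokens)
print(matches_base_rule(tokens))
```

The first token is typed LANGTAG, so a parser that asks for a BASE token
never sees one, even though the input is a perfectly good base
declaration.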

> Yes, I think requiring white space between @prefix and the prefix name
> is a very good idea. More human readable.
>
> So much for last call review this week... sigh...
>
> --Gavin

A simple fix that would be acceptable (to me at least) is:

1/ Remove BASE and PREFIX rules.
2/ Write explicit '@base' and '@prefix'

[5]  base  ::=  '@base' IRIREF
etc

I think this makes it clear what is intended and communicates that the 
LANGTAG tokenization does not apply (this exploits a common feature in 
parser generators, which read this as testing for the string '@base', not 
LANGTAG - I suspect this is what happens in yacker).
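
One way such a string test might look, as a rough sketch only: scan the
general "@word" shape once, then compare the lexeme against the literal
keywords before falling back to LANGTAG. The names classify_at_word and
KEYWORDS and the "KEYWORD" token type are hypothetical, and this is not a
claim about how yacker or any other generator actually does it.

```python
import re

# General "@alpha-alphanumericsanddash" shape (simplified, as an assumption).
AT_WORD = re.compile(r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*")

# Literal strings written directly in the grammar rules.
KEYWORDS = ("@prefix", "@base")


def classify_at_word(text, pos=0):
    """Scan one '@...' word and decide whether it is a keyword or a LANGTAG."""
    match = AT_WORD.match(text, pos)
    if match is None:
        return None
    lexeme = match.group()
    # An exact string comparison wins over the general LANGTAG pattern.
    if lexeme in KEYWORDS:
        return ("KEYWORD", lexeme)
    return ("LANGTAG", lexeme)


print(classify_at_word("@base <http://example.org/>"))
print(classify_at_word("@en-GB"))
print(classify_at_word("@based"))
```

Because the whole "@word" is scanned before the comparison, "@based" stays
a LANGTAG rather than being split into '@base' plus 'd'.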

 Andy

Received on Tuesday, 15 May 2012 15:42:32 UTC