Re: Checking: Turtle grammar / N-Triples Grammar

On 16/05/12 10:38, Peter F. Patel-Schneider wrote:
>
>
> On 05/16/2012 05:30 AM, Andy Seaborne wrote:
>> In case it helps to get Turtle doc to publication, I did some checking
>> of the 2012-05-16T10:00:00+01:00 Turtle and N-Triples grammars because
>> this is last call. I also went over Eric's message of 11/May.
>>
>> Andy
>>
>> ==== 3 Corrections
>
> [...]
>>
>> ==== Presentational
>>
>> == Turtle
>>
> [...]
>> 2/ After reflecting on yesterdays discussion, and noting the greedy
>> tokenization rule, the (WS)+ are not needed.
>>
>> '@base' (WS)+
>>
>> The (WS)+ isn't needed because
>>
>> @baseprefix:<foo>.
>>
>> fails because @baseprefix is a LANGTAG.
>>
> I don't understand what you are getting at here? You appear to be saying
> that this is not valid syntax, but this seems to indicate that the WS is
> needed.

WS is needed to separate tokens.  It does not need to be mentioned in
the grammar rules: conventionally, grammars handle WS during tokenizing
and then discard it.  Specifying WS between grammar rules is not needed
and would be very, very confusing.


(switching to example @prefixprefix)

Tokenizing is greedy - longest match wins - and is not backtracking.

@prefixprefix matches the LANGTAG token production.

@prefixprefix:<foo>.

is in tokens:

LANGTAG('@prefixprefix') PNAME_NS(':') IRIREF('<foo>')

Therefore it does not match '@prefix', which is what is in the latest 
grammar, nor any other rule (whether considered LL(1) or LALR(1)).
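
As a rough illustration of the greedy, non-backtracking tokenizing
described above, here is a minimal Python sketch.  The regular
expressions are simplified stand-ins and not the real Turtle terminal
productions, and the names KW_PREFIX, KW_BASE, DOT and tokenize() are
made up for this example.  On a tie in length the keyword entry is
kept, which is an assumption of the sketch rather than something the
grammar states.

```python
import re

# Simplified stand-ins for the Turtle terminals (assumption: these are
# NOT the real productions, just enough to show longest-match behaviour).
TOKEN_SPECS = [
    ("KW_PREFIX", r"@prefix"),
    ("KW_BASE", r"@base"),
    ("LANGTAG", r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*"),
    ("PNAME_NS", r"(?:[A-Za-z][A-Za-z0-9_-]*)?:"),
    ("IRIREF", r"<[^<>\s]*>"),
    ("DOT", r"\."),
    ("WS", r"[ \t\r\n]+"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in TOKEN_SPECS]


def tokenize(text):
    """Greedy tokenizer: the longest match at each position wins."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in COMPILED:
            m = pattern.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so on a tie the earlier entry in TOKEN_SPECS is kept.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no token matches at offset {pos}")
        # WS separates tokens and is then forgotten.
        if best_name != "WS":
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len  # never revisit consumed input (no backtracking)
    return tokens


if __name__ == "__main__":
    print(tokenize("@prefixprefix:<foo>."))
    print(tokenize("@prefix prefix:<foo>."))
```

With these simplified patterns, the first input starts with a LANGTAG
token and so can never match a rule beginning with '@prefix', while the
second input starts with the keyword token because the whitespace ends
the match after '@prefix'.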

Ideally, it should say "Productions for terminals" after [69s] and some 
words about this.  For example, from another spec:

"""
When tokenizing the input and choosing grammar rules, the longest match 
is chosen.

The SPARQL grammar is LL(1) when the rules with uppercased names are 
used as terminals.
"""

 Andy

>
> [...]
>
> peter
>

Received on Wednesday, 16 May 2012 10:30:11 UTC