Re: Proposed fixed version of N-Triples https://www.w3.org/TR/n-triples/ Section 7 from Andy Seaborne on 2017-06-30 (public-rdf-comments@w3.org from June 2017)

From: Andy Seaborne <andy@apache.org>
Date: Fri, 30 Jun 2017 09:53:10 +0100
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, Eric Prud'hommeaux <eric@w3.org>, public-rdf-comments@w3.org
Message-ID: <0d2077bc-93f0-16e4-966a-54aede69519a@apache.org>

On 30/06/17 01:42, Peter F. Patel-Schneider wrote:
> On 06/29/2017 03:34 PM, Eric Prud'hommeaux wrote:
>> * Andy Seaborne <andy@apache.org> [2017-06-29 21:11+0100]
>>> I think that changing the grammar in this way has disadvantages:
>>>
>>> For larger languages, it adds a lot of clutter.
>>>
>>> It does not reflect the practical aspects of tools.
>>>
>>> Whitespace and comment processing is often done during tokenization and
>>> tokenizers even have special facilities, or common idioms, for doing that.
>>> Having the grammar reflect that helps implementers.
>>
>> strong +1. It is the default behavior of almost every lexer [...] to
>> break on whitespace.
> 
> Not lex, for starters.

The "common idioms" I was referring to include the way the lex family 
handles this.

The idiom is to recognize whitespace but generate no token for it, so it 
is never passed to the rule parser.

It's in the man page, the Wikipedia page and the yacc documentation.

flex.1:

[ \t\n]+      /* eat up whitespace */
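
For readers who do not use lex, the same idiom can be sketched in Python. This is only an illustration: the token names and patterns below are hypothetical simplifications, not the N-Triples grammar, and the tokenize function is invented for this example. The point is that whitespace and comments are matched like any other pattern but never emitted, so the rule parser does not see them.

```python
import re

# Hypothetical token rules, loosely N-Triples-like; NOT the official grammar.
TOKEN_RULES = [
    ("WS",      r"[ \t\n]+"),             # same pattern as the flex rule above
    ("COMMENT", r"#[^\n]*"),              # comment runs to end of line
    ("IRIREF",  r"<[^<>\s]*>"),
    ("STRING",  r'"(?:[^"\\\n]|\\.)*"'),
    ("DOT",     r"\."),
]

# Token kinds that are recognized but never passed on to the parser.
SKIPPED = {"WS", "COMMENT"}

MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_RULES))


def tokenize(text):
    """Yield (kind, lexeme) pairs, silently dropping whitespace and comments."""
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = m.end()
        if m.lastgroup not in SKIPPED:
            yield (m.lastgroup, m.group())


if __name__ == "__main__":
    for token in tokenize('<a> <b> "c" . # a comment\n<d>\t<e> <f> .\n'):
        print(token)
```

With this arrangement the grammar rules can be written purely in terms of IRIREF, STRING and DOT, without mentioning whitespace anywhere.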

     Andy

Received on Friday, 30 June 2017 08:53:45 UTC