- From: Andy Seaborne <andy@apache.org>
- Date: Fri, 30 Jun 2017 09:53:10 +0100
- To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, Eric Prud'hommeaux <eric@w3.org>, public-rdf-comments@w3.org
On 30/06/17 01:42, Peter F. Patel-Schneider wrote:
> On 06/29/2017 03:34 PM, Eric Prud'hommeaux wrote:
>> * Andy Seaborne <andy@apache.org> [2017-06-29 21:11+0100]
>>> I think that changing the grammar in this way has disadvantages:
>>>
>>> For larger languages, it adds a lot of clutter.
>>>
>>> It does not reflect the practical aspects of tools.
>>>
>>> Whitespace and comment processing is often done during tokenization and
>>> tokenizers even have special facilities, or common idioms, for doing that.
>>> Having the grammar reflect that helps implementers.
>>
>> strong +1. It is the default behavior of almost every lexer [...] to
>> break on whitespace.
>
> Not lex, for starters.

The "common idioms" I was referring to include the way the lex-family
handle this.

The idiom is to recognize whitespace then not generate a token, and not
to pass it to the rule parser. It's in the man page, the wikipedia page
and the yacc documentation.

flex.1:

[ \t\n]+        /* eat up whitespace */

    Andy
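
For readers not using lex, the same idiom can be sketched in Python. This is only an illustration under stated assumptions: the token names (WS, COMMENT, IRIREF, DOT), their patterns, and the tokenize function are hypothetical and not taken from any RDF grammar or tool. The point it shows is that whitespace and comments are matched by the tokenizer but never turned into tokens, so the grammar rules never need to mention them.

```python
import re

# Hypothetical token set, chosen only for illustration.
TOKEN_SPEC = [
    ("WS",      r"[ \t\n]+"),    # same pattern as the flex rule quoted above
    ("COMMENT", r"#[^\n]*"),     # assumed '#' line comments
    ("IRIREF",  r"<[^<>\s]*>"),  # simplified IRI reference
    ("DOT",     r"\."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

# Token kinds that are recognized but never handed to the parser.
SKIP = {"WS", "COMMENT"}

def tokenize(text):
    """Yield (kind, lexeme) pairs, silently dropping whitespace and comments."""
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = m.end()
        if m.lastgroup in SKIP:
            continue  # "eat up whitespace": match it, emit nothing
        yield (m.lastgroup, m.group())
```

Calling list(tokenize("<a> <b>\t<c> .")) yields only the three IRIREF tokens and the DOT; the spaces and the tab are consumed without reaching whatever parser sits on top.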
Received on Friday, 30 June 2017 08:53:45 UTC