- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Tue, 08 Mar 2011 18:32:43 +0000
- To: Alex Hall <alexhall@revelytix.com>
- CC: public-rdf-wg <public-rdf-wg@w3.org>
On 08/03/11 17:57, Alex Hall wrote:
> On Tue, Mar 8, 2011 at 12:28 PM, Antoine Zimmermann
> <antoine.zimmermann@insa-lyon.fr> wrote:
>
>     The grammar at http://www.w3.org/2010/01/Turtle/#prod-turtle2-WS has
>     a token called "PASSED TOKENS" which defines comments in Turtle, but
>     it cannot be reached from the root "turtleDoc".
>     It should be included in the <WS> token definition, I guess.
>
>
> I interpret that to mean that comments are recognized as tokens, but
> skipped by the lexer (i.e. not passed to the parser). Of course that
> assumes an implementation that splits recognition into lexing and
> parsing stages -- I'm not aware of other types of recognizers but that
> doesn't mean they aren't out there.
>
> -Alex

Yes -

[[
Section 4.2 Comments
...
Comments are treated as white space.
]]

Like SPARQL, it's assumed they are removed at a low level, as tokens are
formed. Tools, e.g. javacc and many others, can skip or hide comments.

[[
White space (production ws) is used to separate two tokens which would
otherwise be (mis-)recognized as one token.
]]

Then the parser itself does not specify whitespace directly, e.g.

[6] triples ::= subject predicateObjectList

does not say <WS>* after 'subject'. Otherwise there would be a lot of
<WS>* padding, you would still have to talk about misrecognized tokens,
and it would not fit many tool chains.

I think "PASSED TOKENS" is a reflection of the tool chain Eric was using,
as indicated by its rule name of [-].

	Andy
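
A minimal sketch of the "skip at the lexer" approach discussed above, assuming a hand-written regex tokenizer rather than javacc. The token set and the names `TOKEN_RE` and `tokenize` are hypothetical and far simpler than the real Turtle grammar; the point is only that comments and white space are recognized as tokens and then dropped, so grammar rules never need to mention <WS>.

```python
import re

# Hypothetical, heavily simplified token set; NOT the real Turtle grammar.
TOKEN_RE = re.compile(r"""
    (?P<SKIP>\s+|\#[^\n]*)      # white space and comments: recognized, then dropped
  | (?P<IRIREF><[^<>\s]*>)      # e.g. <http://example/s>
  | (?P<STRING>"[^"\n]*")       # short double-quoted string, no escapes
  | (?P<PUNCT>[.;,])            # statement punctuation
""", re.VERBOSE)


def tokenize(text):
    """Return (kind, lexeme) pairs; SKIP tokens never reach the parser."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    doc = '<s> <p> "o" . # a comment\n<s> <p> <o> .'
    for kind, lexeme in tokenize(doc):
        print(kind, lexeme)
```

With this split, a parser for a rule such as `triples ::= subject predicateObjectList` consumes only the non-skipped tokens, which is why the rule does not need <WS>* after 'subject'.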
Received on Tuesday, 8 March 2011 18:33:21 UTC