- From: Andrei Polushin <polushin@gmail.com>
- Date: Sun, 20 Jan 2008 03:50:28 +0600
- To: www-style@w3.org
Justin Rogers wrote:
> It makes more sense to deconstruct the HASH token into two separate
> tokens where Token_Hash is equal "#" and you simply paste this token
> onto the desired end part like ident or name.
There is an IDENT token, but there is no NAME token. If you
introduced
    HASHSIGN := "#"
    NAME := {nmchar}+
they would collide with IDENT and DIMENSION-like tokens, so that it
would be impossible to tokenize the input without knowing the
context, e.g.:
    #abc
        maybe: HASHSIGN IDENT
        or:    HASHSIGN NAME
    #123
        maybe: HASHSIGN NUMBER
        or:    HASHSIGN NAME
    #123.45in
        maybe: HASHSIGN LENGTH
        or:    HASHSIGN NAME '.' NAME
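To make the collision concrete, here is a rough sketch that lists every rule able to match right after the "#". The regular expressions are my own simplified, ASCII-only approximations of the CSS 2.1 macros (no escapes, no non-ASCII), DIMENSION stands in for the LENGTH-like tokens, and the `candidates` helper is hypothetical, so treat it as an illustration rather than a real CSS scanner:

```python
import re

# Simplified, ASCII-only approximations of the CSS 2.1 lexical macros.
# Escapes and non-ASCII characters are left out (an assumption for brevity).
NMSTART = r"[_a-zA-Z]"
NMCHAR = r"[_a-zA-Z0-9-]"
IDENT = rf"-?{NMSTART}{NMCHAR}*"
NUM = r"[0-9]*\.[0-9]+|[0-9]+"

# Rules available once HASH is split into HASHSIGN plus a separate token.
RULES = [
    ("IDENT", IDENT),
    ("NAME", rf"{NMCHAR}+"),              # the hypothetical NAME token
    ("DIMENSION", rf"(?:{NUM}){IDENT}"),  # stands in for LENGTH and friends
    ("NUMBER", NUM),
]

def candidates(text):
    """Return every (token, lexeme) pair that matches at the start of text."""
    found = []
    for name, pattern in RULES:
        m = re.match(pattern, text)
        if m:
            found.append((name, m.group(0)))
    return found

for source in ("#abc", "#123", "#123.45in"):
    # Drop the leading "#" (HASHSIGN) and see what could follow it.
    print(source, "-> HASHSIGN, then one of", candidates(source[1:]))
```

For "abc" both IDENT and NAME match the same three characters, and for "123" both NUMBER and NAME do, so a longest-match rule alone cannot break the tie.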
That is, you would either have to give up on the parser+lexer
approach, or implement a context-sensitive lexer.
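As a minimal sketch of what "context-sensitive" would mean here, the lexer below carries one bit of state (whether the previous token was HASHSIGN) and picks its rule set from it. The patterns are simplified ASCII-only assumptions, and `tokenize` and the DELIM fallback are names invented for this illustration:

```python
import re

# Simplified ASCII-only patterns; escapes and non-ASCII are omitted.
NAME = re.compile(r"[_a-zA-Z0-9-]+")
IDENT = re.compile(r"-?[_a-zA-Z][_a-zA-Z0-9-]*")
NUMBER = re.compile(r"[0-9]*\.[0-9]+|[0-9]+")

def tokenize(text):
    """Tokenize text, letting the previous token decide which rules apply."""
    tokens = []
    pos = 0
    after_hash = False  # the one bit of context this lexer carries
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            pos += 1
            after_hash = False
            continue
        if ch == "#":
            tokens.append(("HASHSIGN", ch))
            pos += 1
            after_hash = True
            continue
        # Directly after HASHSIGN only NAME is tried; elsewhere it never is.
        if after_hash:
            rules = [("NAME", NAME)]
        else:
            rules = [("IDENT", IDENT), ("NUMBER", NUMBER)]
        after_hash = False
        for kind, pattern in rules:
            m = pattern.match(text, pos)
            if m:
                tokens.append((kind, m.group(0)))
                pos = m.end()
                break
        else:
            # Nothing matched: emit the single character as a delimiter.
            tokens.append(("DELIM", ch))
            pos += 1
    return tokens

print(tokenize("#123 123 #abc abc"))
```

The same characters "123" come out as NAME after the hash sign and as NUMBER elsewhere, which is exactly the kind of state a plain context-free scanner does not keep.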
--
Andrei Polushin
Received on Saturday, 19 January 2008 21:51:04 UTC