- From: Zack Weinberg <zackw@panix.com>
- Date: Fri, 5 Apr 2013 09:27:33 -0400
- To: Simon Sapin <simon.sapin@exyr.org>
- Cc: "Kang-Hao (Kenny) Lu" <kanghaol@oupeng.com>, "Tab Atkins Jr." <jackalmage@gmail.com>, www-style list <www-style@w3.org>
On Fri, Apr 5, 2013 at 3:24 AM, Simon Sapin <simon.sapin@exyr.org> wrote:
> I don’t see much harm either, but the underlying point is about similar
> differences that wouldn’t be detectable. For example, what if my parser has
> separate INTEGER and NUMBER tokens rather than having a type flag on NUMBER
> tokens? What if it represents percentages as DIMENSION tokens with '%' for
> the unit, rather than as a separate token?
>
> As long as tokens/component values are not exposed to the platform, these
> are only implementation details. I do believe that exposing them
> eventually (maybe only on variables) is the way to go to enable CSS
> polyfills, but that would effectively freeze the tokenizer.
>
> (This is also a concern for me, as author of a parsing library where CSS
> tokens are part of the public API.)

Gecko does have a few such divergences. For instance, CDO and CDC are
merged. I've also occasionally thought about giving all
syntactically-meaningful DELIMs their own token codes, or even perhaps
doing the same for all syntactically-meaningful identifiers. It might
well make the parser go faster.

I don't see any huge difficulty in hiding these internal divergences
from a public API that exposed tokens -- you just need a reverse
mapping. Of course it's nice to not have to do that.

I tend to think that the tokenizer should be considered mostly frozen,
but I don't see any harm in adding new "punctuators" (to borrow a term
from C) as necessary. An alternative (a la Smalltalk) would be to
declare that any two-character sequence of DELIM characters -- that
is, ASCII punctuation excluding ,;:()[]{} -- is a single token. That
would be future-proof, but we'd have to audit the existing grammar
carefully to make sure it doesn't do anything it shouldn't.

zw
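
To make the "reverse mapping" idea above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `InternalToken` and `PublicToken` classes, the `HTML_COMMENT_DELIM` kind standing in for a merged CDO/CDC token, and the `to_public` function are hypothetical names, not Gecko's or any library's actual API. It only shows that the internal divergences discussed in this thread (separate INTEGER tokens, percentages stored as DIMENSION with a '%' unit, merged CDO/CDC) can each be translated back to a spec-shaped token at the API boundary.

```python
from dataclasses import dataclass
from typing import Optional, Union

Value = Union[int, float, str, None]


@dataclass(frozen=True)
class InternalToken:
    """Hypothetical internal token; its kinds may diverge from the spec."""
    kind: str
    value: Value = None
    unit: Optional[str] = None


@dataclass(frozen=True)
class PublicToken:
    """Hypothetical token shape exposed through a public API."""
    kind: str
    value: Value = None
    unit: Optional[str] = None
    is_integer: Optional[bool] = None


def to_public(tok: InternalToken) -> PublicToken:
    """Map one internal token back to its public equivalent."""
    # Internal INTEGER/NUMBER split becomes a type flag on NUMBER.
    if tok.kind == "INTEGER":
        return PublicToken("NUMBER", tok.value, is_integer=True)
    if tok.kind == "NUMBER":
        return PublicToken("NUMBER", tok.value, is_integer=False)
    # Percentages stored as DIMENSION with unit '%' become PERCENTAGE.
    if tok.kind == "DIMENSION" and tok.unit == "%":
        return PublicToken("PERCENTAGE", tok.value)
    # A merged CDO/CDC token is split again based on its source text.
    if tok.kind == "HTML_COMMENT_DELIM":
        return PublicToken("CDO" if tok.value == "<!--" else "CDC")
    # Everything else is assumed to match the public shape already.
    return PublicToken(tok.kind, tok.value, tok.unit)
```

The mapping is applied per token, so under these assumptions the internal representation stays free to change as long as `to_public` is kept in step with it.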
Received on Friday, 5 April 2013 13:27:58 UTC