- From: John Daggett <jdaggett@mozilla.com>
- Date: Thu, 20 Nov 2014 05:24:21 -0800 (PST)
- To: www-style list <www-style@w3.org>
Tab Atkins wrote:

>> I can't say that I *like* this, but that's because I am
>> philosophically not a fan of special tokenizer productions that
>> only apply in specific grammar contexts -- can anyone think of a
>> *practical* problem? It's not any worse than unquoted url() in
>> terms of code, it can't change the boundaries of a top-level
>> construct, and the only other issue that comes to mind is that
>> it'll make it harder to use <unicode-range-token> somewhere else
>> in the future. But I don't know that there *are* other uses, so.
>
> That requires a vastly more complicated change, switching the
> Syntax module from being separate tokenizer/parser steps to being
> integrated, with a lot more state being thrown around. And it
> doesn't help us if we ever want to use <urange> in another
> property or context, which I think is plausible.

Tab, the first line of your algorithm for handling <urange> sequences
is [*]:

  1. Skipping the first u token, concatenate the representations of
     all the tokens in the production together (or, in the case of
     <dimension-token>s, the representation followed by the unit).
     Let this be text.

Let's not kid ourselves here, that's basically taking the token soup
that results from removing the UNICODE-RANGE token and saying "take
these tokens and start over from scratch". Calling these "separate
tokenizer/parser steps" is basically bogus, since your algorithm is
effectively re-tokenizing the sequence within the parser. It would
work just as well to say as part of selector parsing "if you see a
unicode-range token, convert it to text and use this algorithm to
come up with a selector". Both are hacks of equal standing; you won't
be winning any design contests with either.

I think if we were actually trying to create an accurate
representation of <urange> in a grammar form, it would look something
like:

  <urange> = ['u' | 'U'] '+' [ <hex-value> ['-' <hex-value>]? ] | [ <hex-value>? '?'+ ]

Here, <hex-value> would be a sequence of hexadecimal digits with the
appropriate restrictions on number of digits and value range applied.
I realize we don't currently have a clean way of representing
<hex-value> as a sequence of CSS tokens, hence the need for hacking.

The new syntax for <urange> in the Syntax spec is an ugly change
but, meh, we can make it work.

John Daggett

[*] http://dev.w3.org/csswg/css-syntax/#urange-syntax
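For illustration only, the <urange> grammar sketched in the message above can be checked against raw text with a regular expression. This is a rough sketch, not the Syntax spec's algorithm and not anything proposed in the thread: the name parse_urange is hypothetical, and the limits used for <hex-value> (at most six characters, nothing above 0x10FFFF, start not greater than end) are assumptions about what "the appropriate restrictions" would be.

```python
import re

# Hypothetical helper: matches the grammar from the message against raw text,
# not against CSS tokens. The digit-count limits are assumed.
_URANGE = re.compile(
    r"[uU]\+(?:"
    r"(?P<start>[0-9a-fA-F]{1,6})(?:-(?P<end>[0-9a-fA-F]{1,6}))?"  # hex or hex-hex
    r"|(?P<wild>[0-9a-fA-F]{0,5}\?{1,6})"                          # optional hex, then '?'s
    r")\Z"
)

MAX_CODE_POINT = 0x10FFFF  # assumed upper bound for a valid range


def parse_urange(text):
    """Return (start, end) as integers, or None if text is not a valid range."""
    m = _URANGE.match(text)
    if m is None:
        return None
    wild = m.group("wild")
    if wild is not None:
        if len(wild) > 6:
            return None
        # Each '?' stands for any hex digit: 0 gives the low end, F the high end.
        start = int(wild.replace("?", "0"), 16)
        end = int(wild.replace("?", "F"), 16)
    else:
        start = int(m.group("start"), 16)
        end = int(m.group("end"), 16) if m.group("end") else start
    if end > MAX_CODE_POINT or start > end:
        return None
    return (start, end)
```

Under these assumptions, parse_urange("U+0-7F") and parse_urange("U+4??") return integer pairs, while inputs such as "U+110000" or "U+FF-25" return None. The point of the sketch is that the grammar is trivial to apply to text, which is exactly why expressing it over already-split CSS tokens needs the concatenate-and-reparse step.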
Received on Thursday, 20 November 2014 13:24:49 UTC