- From: fantasai <fantasai.lists@inkedblade.net>
- Date: Tue, 12 Apr 2016 17:27:46 -0400
- To: "Tab Atkins Jr." <jackalmage@gmail.com>, www-style list <www-style@w3.org>
On 04/12/2016 04:37 PM, Tab Atkins Jr. wrote:
> History: CSS2.1 defined a special grammar token just for unicode
> ranges, which was used in exactly one place: the 'unicode-range'
> descriptor of @font-face. This special production caused bugs in
> pages, where selectors like `u+a { ... }` were parsed as a
> UNICODE-RANGE token, rather than the expected "IDENT(u) DELIM(+)
> IDENT(a)", like every other selector of that form was parsed. (This
> isn't theoretical - Moz had a bug reported against it for this.)
>
> When writing the Syntax spec, I tried to fix this by dropping the
> unicode-range concept from the tokenizer, and instead handling it as a
> complex construct of the existing tokens, like I did with <an+b>.
> This kinda worked initially, but was *really* nasty. Since then, we
> added scinot to numbers (like 1e3 for 1000), and this *completely
> destroyed* my ability to define <urange> cleanly - I can no longer use
> the value of numeric tokens, and instead have to rely on the
> "representation", which no browser stores or wants to store.
>
> I want to go ahead and resolve this. I can see three options:
>
> 1. Keep what I'm currently doing. This requires browsers to hold onto
> the string representation of numeric tokens (numbers and dimensions)
> at least through initial parsing (longer if they're used in a custom
> property).
>
> 2. Abandon this effort, go back to having a special unicode-range
> token. Accept that this is weird and there are stupid side-effects,
> like some selectors not working.
>
> 3. Define a new <urange> syntax that's actually simple to obtain from
> the existing tokens¹. Deprecate the old syntax; require UAs to accept
> the old syntax in the 'unicode-range' descriptor, but don't define how
> they should do so. (Current UAs use context-sensitive retokenizing, I
> think - once they realize they're in a unicode-range descriptor,
> they'll retokenize the original text according to a special set of
> rules.)
>
> Thoughts?
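For anyone following along, here is a rough Python sketch of the two problems described in the quoted message. It is only an illustration, not how any browser or the Syntax spec does it: the regex is a simplified stand-in for the CSS2.1 UNICODE-RANGE production (no `?` wildcards), and the helper names `legacy_first_token`, `numeric_value`, and `hex_code_point` are made up for this example.

```python
import re

# Simplified stand-in for the CSS2.1 UNICODE-RANGE token (assumption:
# only "u+hex" and "u+hex-hex" forms, no "?" wildcards).
UNICODE_RANGE = re.compile(r"[uU]\+[0-9a-fA-F]{1,6}(-[0-9a-fA-F]{1,6})?")


def legacy_first_token(text):
    """Return the text a legacy tokenizer would claim as UNICODE-RANGE
    at the start of `text`, or None if the production does not match."""
    m = UNICODE_RANGE.match(text)
    return m.group(0) if m else None


def numeric_value(representation):
    """Value a number token would keep for this source text
    (scientific notation included)."""
    return float(representation)


def hex_code_point(representation):
    """Code point the same source text denotes when read as hex digits."""
    return int(representation.lstrip("+"), 16)


# Problem 1: the selector "u+a" is swallowed as a single token.
swallowed = legacy_first_token("u+a { color: red }")

# Problem 2: with scinot, distinct ranges collapse to the same value,
# so the original text ("representation") is needed to tell them apart.
same_value = numeric_value("+0e3") == numeric_value("+0e5")
different_ranges = hex_code_point("+0e3") != hex_code_point("+0e5")
```

In this sketch `u+0e3` and `u+0e5` both produce a number token with value 0, even though they name different code points, which is why only the stored source text can distinguish them.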
Given unicode-range is already shipping
  http://caniuse.com/#feat=font-unicode-range
I think #3 is a non-starter.

I would imagine that reparsing unicode-range tokens in order to make the selectors work would be easier than doing #1, no? Hanging onto unicode-range tokens would be a lot less memory than hanging onto numbers and dimensions, given they're used so rarely.

~fantasai
Received on Tuesday, 12 April 2016 21:28:19 UTC