- From: Zack Weinberg <zackw@panix.com>
- Date: Sat, 18 May 2013 21:27:11 -0400
- To: "Tab Atkins Jr." <jackalmage@gmail.com>
- CC: www-style list <www-style@w3.org>
On 2013-05-17 4:16 PM, Tab Atkins Jr. wrote:
> On Fri, May 17, 2013 at 11:12 AM, Zack Weinberg <zackw@panix.com> wrote:
>> * Regarding recursive-descent-style tokenization and removal of
>>   pushback, you were skeptical that this would be easier to read.
>>   Would you be interested in me attempting to rewrite section 4 with
>>   those changes, to see how it goes? It would be pretty major so I
>>   don't want to do it if you're not at least curious whether it would
>>   be better.
>
> A small section would suffice. Could you just try rewriting the
> number/percentage/dimension parsing? That's probably the most complex
> set of interlocking states.

OK, I'll try that.

>> * throughout: Unlike other Unicode character names, U+FFFD REPLACEMENT
>>   CHARACTER should *not* be followed by the literal character in
>>   parens (�).
>
> Why?

Um, I have this vague memory that the Unicode standard somewhere says
literal REPLACEMENT CHARACTER isn't supposed to appear in original
documents -- it's only for when conversion processes throw up their
hands -- but I can't find the text I'm remembering and it's possible
there never was any such statement. Anyway it's not that important.

>> * 4. Unicode-range tokens may need a "valid" flag. I need to
>>   cross-check the code in Gecko against the algorithm in this spec
>>   carefully, but the definition of UNICODE-RANGE in CSS2.1 included
>>   several forms that were semantically invalid.
>
> The parser in Syntax ended up only accepting valid unicode ranges
> (except that it does, technically, allow for ranges where the min is
> higher than the max). This is more restrictive than CSS 2.1, but it
> only fails to cover things that were invalid in the first place.

I will pay careful attention to this section when I go back through.

zw
Received on Sunday, 19 May 2013 01:27:40 UTC