- From: Tab Atkins Jr. <jackalmage@gmail.com>
- Date: Sun, 19 May 2013 01:46:02 -0700
- To: Zack Weinberg <zackw@panix.com>
- Cc: www-style list <www-style@w3.org>
On Sat, May 18, 2013 at 6:27 PM, Zack Weinberg <zackw@panix.com> wrote:
> On 2013-05-17 4:16 PM, Tab Atkins Jr. wrote:
>>
>> On Fri, May 17, 2013 at 11:12 AM, Zack Weinberg <zackw@panix.com> wrote:
>>> * 4. Unicode-range tokens may need a "valid" flag. I need to
>>> cross-check the code in Gecko against the algorithm in this spec
>>> carefully, but the definition of UNICODE-RANGE in CSS2.1 included
>>> several forms that were semantically invalid.
>>
>> The parser in Syntax ended up only accepting valid unicode ranges
>> (except that it does, technically, allow for ranges where the min is
>> higher than the max). This is more restrictive than CSS 2.1, but it
>> only fails to cover things that were invalid in the first place.
>
> I will pay careful attention to this section when I go back through.
Just to be totally clear, I think it's obvious that the definition of
UNICODE-RANGE in 2.1 was simply due to someone being lazy with the
regex:
u\+[0-9a-f?]{1,6}(-[0-9a-f]{1,6})?
This allows nonsensical things like "u+??a0" or "u+00?-500". It could
easily have been written exhaustively to be correct, if a bit long:
u\+((\?{1,6})|([0-9a-f]\?{0,5})|([0-9a-f]{2}\?{0,4})|([0-9a-f]{3}\?{0,3})|([0-9a-f]{4}\?{0,2})|([0-9a-f]{5}\??)|([0-9a-f]{6})|([0-9a-f]{1,6}-[0-9a-f]{1,6}))
Like I said, verbose but easy. (I can't read that text right now to
figure out if I got my parens right. You get my meaning, though.)
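
To make the difference concrete, here is a rough Python sketch that runs a few sample strings through both patterns. It is only an illustration, not anything from the spec: the helper name `accepts`, the sample list, and the use of Python's `re` module are my own choices. It assumes the range alternative is meant to sit inside the group after `u\+`, and that the `?` after five hex digits is optional (so plain five-digit code points like u+1f600 still match). Note that neither pattern checks that the start of a range is no greater than its end.

```python
import re

# Lax pattern, as quoted from CSS 2.1.
LAX = re.compile(r"u\+[0-9a-f?]{1,6}(-[0-9a-f]{1,6})?", re.IGNORECASE)

# Exhaustive pattern. Assumptions (not confirmed by the spec text):
# the range form lives inside the group after "u\+", and the "?" after
# five hex digits is optional.
STRICT = re.compile(
    r"u\+(?:"
    r"\?{1,6}"                          # wildcards only
    r"|[0-9a-f]\?{0,5}"                 # 1 hex digit, up to 5 wildcards
    r"|[0-9a-f]{2}\?{0,4}"              # 2 hex digits, up to 4 wildcards
    r"|[0-9a-f]{3}\?{0,3}"              # 3 hex digits, up to 3 wildcards
    r"|[0-9a-f]{4}\?{0,2}"              # 4 hex digits, up to 2 wildcards
    r"|[0-9a-f]{5}\??"                  # 5 hex digits, optional wildcard
    r"|[0-9a-f]{6}"                     # 6 hex digits
    r"|[0-9a-f]{1,6}-[0-9a-f]{1,6}"     # explicit start-end range
    r")",
    re.IGNORECASE,
)


def accepts(pattern, text):
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


# Hypothetical sample inputs, chosen for illustration.
SAMPLES = ["u+??a0", "u+00?-500", "u+4??", "u+26", "u+1f600",
           "u+0025-00ff", "u+??????", "u+500-100"]

if __name__ == "__main__":
    for s in SAMPLES:
        print(f"{s:12} lax={accepts(LAX, s)!s:5} strict={accepts(STRICT, s)!s:5}")
```

The two nonsensical examples pass the lax pattern and fail the exhaustive one, while the well-formed values pass both.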
Further, CSS2 didn't even define what the invalid syntaxes meant, or
how to treat them. Fonts 3 does, but it merely considers them
invalid, which would still be the case in the current Syntax handling,
since they'd end up as a combination of idents and numbers and delims.
Unicode-range tokens are already invalid everywhere outside of the
'unicode-range' descriptor, so there's also no risk of behavior change there.
So, Syntax's behavior is just being less lazy with the recognition of
valid tokens, but will, if I'm reasoning correctly, not even have a
theoretical effect on the behavior of current pages.
~TJ
Received on Sunday, 19 May 2013 08:46:52 UTC