[css21] Proposed regex for unicode-range token

From: Tab Atkins Jr. <jackalmage@gmail.com>
Date: Wed, 4 Sep 2013 09:43:15 -0700
Message-ID: <CAAWBYDC+=95NAHdPGoioiimo9m3=SA96-5masDcDQYHtyPVTtA@mail.gmail.com>
To: www-style list <www-style@w3.org>

Per today's telcon, I need to propose a new regex for the
unicode-range token in 2.1, to bring it in line with the Syntax
definition we agreed on.

For clarity, here's the current regex:

u\+[0-9a-f?]{1,6}(-[0-9a-f]{1,6})?

This properly covers all the sensible unicode-range syntax, but it
also accidentally covers nonsensical ranges like "u+1?3" or
"u+???-500", which can't be interpreted as a range.
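To make the over-matching concrete, here is a minimal Python sketch that runs the current regex against a few tokens. The helper name `matches_current`, the sample tokens, the use of whole-token matching, and case-insensitive matching are assumptions of this sketch, not part of the proposal.

```python
import re

# The current CSS 2.1 regex, copied verbatim from the message above.
# Case-insensitive matching is an assumption of this sketch.
CURRENT = re.compile(r"u\+[0-9a-f?]{1,6}(-[0-9a-f]{1,6})?", re.IGNORECASE)


def matches_current(token: str) -> bool:
    """Return True if the whole token matches the current regex."""
    return CURRENT.fullmatch(token) is not None


if __name__ == "__main__":
    # The first three are sensible ranges; the last two are the
    # nonsensical examples quoted in the message.
    for token in ["u+26", "u+4??", "u+0-7f", "u+1?3", "u+???-500"]:
        print(token, matches_current(token))
```

All five tokens are accepted, including the two that cannot be read as a range.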

Here's a new regex that only covers the syntax we actually want:

(u\+[?]{1,6})|(u\+[0-9a-f]{1}[?]{0,5})|(u\+[0-9a-f]{2}[?]{0,4})|(u\+[0-9a-f]{3}[?]{0,3})|(u\+[0-9a-f]{4}[?]{0,2})|(u\+[0-9a-f]{5}[?]{0,1})|(u\+[0-9a-f]{6})|(u\+[0-9a-f]{1,6}-[0-9a-f]{1,6})

(This regex was contributed by Simon; I had written a functionally
identical but less clear one earlier.)

Here's the same regex with one alternative per line (ignore the line breaks):

(u\+[?]{1,6})|
(u\+[0-9a-f]{1}[?]{0,5})|
(u\+[0-9a-f]{2}[?]{0,4})|
(u\+[0-9a-f]{3}[?]{0,3})|
(u\+[0-9a-f]{4}[?]{0,2})|
(u\+[0-9a-f]{5}[?]{0,1})|
(u\+[0-9a-f]{6})|
(u\+[0-9a-f]{1,6}-[0-9a-f]{1,6})
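The alternatives follow a regular pattern: N hex digits followed by at most 6-N question marks, plus the all-wildcard form and the explicit start-end form. As a rough sketch, the regex can be rebuilt from that pattern in Python and checked against sample tokens. The names `build_proposed` and `is_unicode_range`, whole-token matching, and case-insensitive matching are assumptions made for illustration.

```python
import re

HEX = "[0-9a-f]"


def build_proposed() -> str:
    """Rebuild the proposed regex from its pattern of alternatives."""
    alternatives = [r"(u\+[?]{1,6})"]  # wildcards only
    for digits in range(1, 6):
        # N hex digits, then up to 6 - N trailing question marks.
        alternatives.append(
            r"(u\+%s{%d}[?]{0,%d})" % (HEX, digits, 6 - digits)
        )
    alternatives.append(r"(u\+%s{6})" % HEX)  # six digits, no wildcards
    alternatives.append(r"(u\+%s{1,6}-%s{1,6})" % (HEX, HEX))  # start-end
    return "|".join(alternatives)


# Case-insensitive matching is an assumption of this sketch.
PROPOSED = re.compile(build_proposed(), re.IGNORECASE)


def is_unicode_range(token: str) -> bool:
    """Return True if the whole token matches the proposed regex."""
    return PROPOSED.fullmatch(token) is not None


if __name__ == "__main__":
    for token in ["u+26", "u+4??", "u+0-7f", "u+??????", "u+1?3", "u+???-500"]:
        print(token, is_unicode_range(token))
```

Under these assumptions the first four tokens match, while "u+1?3" and "u+???-500" are rejected.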

~TJ
Received on Wednesday, 4 September 2013 16:44:02 UTC