[css21] Proposed regex for unicode-range token

Per today's telcon, I need to propose a new regex for the
unicode-range token in 2.1, to bring it in line with the Syntax
definition we agreed on.

For clarity, here's the current regex:

u\+[0-9a-f?]{1,6}(-[0-9a-f]{1,6})?

This covers all the sensible unicode-range syntax, but it also
accidentally covers nonsensical forms like "u+1?3" (a wildcard
followed by more hex digits) or "u+???-500" (wildcards combined with
an explicit range end), neither of which can be interpreted as a range.
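
As a quick sanity check of that claim, here is a minimal Python sketch
that runs the current regex over a few sample strings. The names
OLD_UNICODE_RANGE and old_matches are just for illustration, and using
fullmatch on a whole string is a simplification: a real tokenizer
would take the longest matching prefix instead.

```python
import re

# The current CSS 2.1 pattern, copied verbatim from this message.
OLD_UNICODE_RANGE = re.compile(r"u\+[0-9a-f?]{1,6}(-[0-9a-f]{1,6})?")


def old_matches(token: str) -> bool:
    """Return True if the whole string matches the current regex."""
    return OLD_UNICODE_RANGE.fullmatch(token) is not None


# The first three are sensible; the last two are the nonsensical
# examples quoted in this message.
for sample in ["u+26", "u+4??", "u+0-7f", "u+1?3", "u+???-500"]:
    print(sample, old_matches(sample))
```

Every sample, including the two nonsensical ones, is accepted.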

Here's a new regex that only covers the syntax we actually want:

(u\+[?]{1,6})|(u\+[0-9a-f]{1}[?]{0,5})|(u\+[0-9a-f]{2}[?]{0,4})|(u\+[0-9a-f]{3}[?]{0,3})|(u\+[0-9a-f]{4}[?]{0,2})|(u\+[0-9a-f]{5}[?]{0,1})|(u\+[0-9a-f]{6})|(u\+[0-9a-f]{1,6}-[0-9a-f]{1,6})

(This regex was contributed by Simon; I was writing a functionally
identical but less clear one earlier.)

Here's the same regex with one alternative per line, which is easier
to read (ignore the line breaks):

(u\+[?]{1,6})|
(u\+[0-9a-f]{1}[?]{0,5})|
(u\+[0-9a-f]{2}[?]{0,4})|
(u\+[0-9a-f]{3}[?]{0,3})|
(u\+[0-9a-f]{4}[?]{0,2})|
(u\+[0-9a-f]{5}[?]{0,1})|
(u\+[0-9a-f]{6})|
(u\+[0-9a-f]{1,6}-[0-9a-f]{1,6})
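
For anyone who wants to try the proposal, here is a rough Python
sketch that compiles the multi-line form with re.VERBOSE (which
ignores the whitespace) and checks a few strings. The names
NEW_UNICODE_RANGE and new_matches are made up for this example, and
whole-string matching with fullmatch is again a simplification of what
a tokenizer does.

```python
import re

# The proposed pattern, one alternative per line. re.VERBOSE makes the
# regex engine skip the whitespace and newlines between alternatives.
NEW_UNICODE_RANGE = re.compile(
    r"""
    (u\+[?]{1,6})|
    (u\+[0-9a-f]{1}[?]{0,5})|
    (u\+[0-9a-f]{2}[?]{0,4})|
    (u\+[0-9a-f]{3}[?]{0,3})|
    (u\+[0-9a-f]{4}[?]{0,2})|
    (u\+[0-9a-f]{5}[?]{0,1})|
    (u\+[0-9a-f]{6})|
    (u\+[0-9a-f]{1,6}-[0-9a-f]{1,6})
    """,
    re.VERBOSE,
)


def new_matches(token: str) -> bool:
    """Return True if the whole string matches the proposed regex."""
    return NEW_UNICODE_RANGE.fullmatch(token) is not None


# Sensible forms: plain code point, trailing wildcards, explicit range.
accepted = ["u+26", "u+4??", "u+??????", "u+0-7f", "u+10ffff"]
# Nonsensical or over-long forms that should not match as a whole.
rejected = ["u+1?3", "u+???-500", "u+12345??", "u+1234567"]

for sample in accepted + rejected:
    print(sample, new_matches(sample))
```

The strings in "accepted" match and the ones in "rejected" do not.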

~TJ

Received on Wednesday, 4 September 2013 16:44:02 UTC