[css21] URL token grammar doesn't match reality

The CSS2.1 Core Grammar currently specifies that the only way to get a
URL token is with the literal characters "u", "r", and "l"
(case-insensitive).  If you escape any of them, you'll instead get a
FUNCTION token.

This doesn't match reality - IE, FF, and Opera all allow the
characters to be escaped and still invoke the normal URL token
parsing.  Here's a testcase with several inputs that help distinguish
between the two tokenizations (URI vs. FUNCTION):
http://software.hixie.ch/utilities/js/live-dom-viewer/saved/1519  IE,
FF, and Opera all return results consistent with always using the
special "unquoted url" production.

I propose we change the Core Grammar as follows:

1. Import the U, R, and L productions from Appendix G.
2. Change the URI token production to:
  {U}{R}{L}\({w}{string}{w}\)
  |{U}{R}{L}\({w}([!#$%&*-\[\]-~]|{nonascii}|{escape})*{w}\)
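
To make the difference concrete, here's a rough Python sketch of the
current and proposed URI productions as regexes.  The macro definitions
are my transcription of the flex macros in CSS2.1 section 4.1.1 and
Appendix G, so treat them as an approximation rather than a reference
tokenizer.  The names is_uri_current and is_uri_proposed are made up for
illustration, and it only checks whether a whole string is a single URI
token.

```python
import re

# Macros transcribed from CSS2.1 4.1.1 / Appendix G (assumed, not verified
# against every erratum). Flex is case-insensitive here, so we compile
# with re.IGNORECASE below.
WS_OPT = r"(?:\r\n|[ \t\r\n\f])?"
W = r"[ \t\r\n\f]*"
NL = r"(?:\n|\r\n|\r|\f)"
NONASCII = r"[^\x00-\x9f]"
UNICODE = rf"\\[0-9a-f]{{1,6}}{WS_OPT}"
ESCAPE = rf"(?:{UNICODE}|\\[^\n\r\f0-9a-f])"
STRING1 = rf'"(?:[^\n\r\f\\"]|\\{NL}|{ESCAPE})*"'
STRING2 = rf"'(?:[^\n\r\f\\']|\\{NL}|{ESCAPE})*'"
STRING = rf"(?:{STRING1}|{STRING2})"
UNQUOTED = rf"(?:[!#$%&*-\[\]-~]|{NONASCII}|{ESCAPE})*"

# Appendix G letter productions: the literal letter, a hex escape for
# either case, or a backslash followed by the letter.
U = rf"(?:u|\\0{{0,4}}(?:55|75){WS_OPT}|\\u)"
R = rf"(?:r|\\0{{0,4}}(?:52|72){WS_OPT}|\\r)"
L = rf"(?:l|\\0{{0,4}}(?:4c|6c){WS_OPT}|\\l)"


def _uri(prefix: str) -> "re.Pattern[str]":
    """Build the two-branch URI production with the given 'url' prefix."""
    pattern = (
        rf"(?:{prefix}\({W}{STRING}{W}\)"
        rf"|{prefix}\({W}{UNQUOTED}{W}\))"
    )
    return re.compile(pattern, re.IGNORECASE)


CURRENT_URI = _uri("url")           # literal u, r, l only
PROPOSED_URI = _uri(f"{U}{R}{L}")   # escaped letters allowed


def is_uri_current(text: str) -> bool:
    """True if the whole string is a URI token under the current grammar."""
    return CURRENT_URI.fullmatch(text) is not None


def is_uri_proposed(text: str) -> bool:
    """True if the whole string is a URI token under the proposed grammar."""
    return PROPOSED_URI.fullmatch(text) is not None
```

With this sketch, "url(foo.png)" is a URI token under both versions,
while "\75 rl(foo.png)" and "u\72l(foo.png)" are URI tokens only under
the proposed one.  "url(a b)" is rejected by both, since an unquoted url
can't contain whitespace in the middle.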

(We may need to do the same for the leading "u" on the UNICODE-RANGE
token.  I haven't tested to see yet.)

~TJ

Received on Tuesday, 8 May 2012 13:48:38 UTC