[css21] url() syntax, core tokens vs grammar from Bjoern Hoehrmann on 2010-07-28 (www-style@w3.org from July 2010)

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Wed, 28 Jul 2010 04:57:02 +0200
To: www-style@w3.org
Message-ID: <6j5v46plnqqbbrq6vj8uftfntrgdpl1duk@hive.bjoern.hoehrmann.de>

Hi,

  http://www.w3.org/TR/2009/CR-CSS2-20090908/syndata.html conflicts with
http://www.w3.org/TR/2009/CR-CSS2-20090908/grammar.html in the definition
of the url() syntax. Specifically, the CSS 2.1 grammar allows the "url"
identifier to be escaped, while the core lexical scanner does not
consider something a URI token when such escapes are present. This
causes some confusion about how to parse it, for instance, what to do
with the following (and imagine each of these with an escaped "url" as
well):

  * url("..." /*...*/)
  * url(... <some control character not allowed here> ...)
  * url([)
  * url(... <end of style sheet>

For instance, in the last case, you could argue that this is neither a
URI token per the core syntax nor a URI token according to the CSS 2.1
grammar, so you don't treat it as such; or you could argue that, since
you are to parse it according to the core rules and recover from missing
parentheses by implying them, you do treat it as a URI token. Similarly,
"url([)" is a proper URI token and you treat it as such, but for
"ur\l([)" you treat the "[" as the beginning of a `'[' S* any* ']'`
block, as that is the implication of parsing per the core syntax.
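
To make the "url([)" versus "ur\l([)" difference concrete, here is a
rough sketch of the two scanners as regular expressions. It is a heavy
simplification and not the productions from either document: it covers
only the unquoted form, leaves out non-ASCII characters and escapes
inside the URL body, and assumes ASCII case-insensitive matching. The
names CORE_URI, GRAMMAR_URI, escapable and classify are made up for
this illustration.

```python
import re

# Optional whitespace and a simplified unquoted URL body (printable ASCII
# minus quotes, parentheses, backslash and space). Assumption: non-ASCII
# characters and escapes inside the body are left out for brevity.
W = r"[ \t\r\n\f]*"
BODY = r"[!#$%&\x2a-\x5b\x5d-\x7e]*"


def escapable(ch):
    """Match a letter written literally, as backslash + letter, or as a
    hex escape of its code point (hypothetical helper for this sketch)."""
    upper, lower = "%x" % ord(ch.upper()), "%x" % ord(ch.lower())
    return r"(?:%s|\\%s|\\0{0,4}(?:%s|%s)(?:\r\n|[ \t\r\n\f])?)" % (
        ch, ch, upper, lower)


# Core-scanner style: "url(" has to appear literally.
CORE_URI = re.compile(r"url\(" + W + BODY + W + r"\)", re.IGNORECASE)

# Grammar style: each letter of "url" may also be escaped.
GRAMMAR_URI = re.compile(
    escapable("u") + escapable("r") + escapable("l")
    + r"\(" + W + BODY + W + r"\)",
    re.IGNORECASE,
)


def classify(text):
    """Report which of the two simplified scanners sees one URI token."""
    return {
        "core": CORE_URI.fullmatch(text) is not None,
        "grammar": GRAMMAR_URI.fullmatch(text) is not None,
    }


print(classify("url([)"))    # accepted by both sketches
print(classify("ur\\l([)"))  # accepted by the grammar-style sketch only
```

With these patterns "url([)" is a single URI token for both scanners,
while "ur\l([)" is a URI token only for the grammar-style one; a
core-style scanner would go on to treat the "[" as opening a block.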

The issue would go away if the core and the CSS 2.1 scanner used the
same tokenization rules.

regards,
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/

Received on Wednesday, 28 July 2010 02:57:35 UTC