[CSS2.1] syntax: url() and unexpected end of (line / style sheet)

I have some questions of interpretation, and a suggestion for refinement, regarding the interaction of the URI token and the "unexpected end of" rules in section 4.7:

1) "User agents must close all constructs (nonexhaustive list) at the end of the style sheet."  Because URI is a single token rather than a grammar production, it is not clear whether it counts as a "construct" for this rule.  For example, should

  <style>#foo { background-image: url("picture.png</style>

be treated as valid and identical to

  <style>#foo { background-image: url("picture.png") }</style> ?

2) "User agents must close strings upon reaching the end of a line, but then drop the construct in which the string was found."  A URI token is not a string, even if it uses the quoted form.  Should this rule apply anyway, and if so, to which forms?  Consider, e.g.,

  <style>#foo { background-image: url(data:image/png;base64,
iVBORw0KGgoAAAANSUhEUgAAAAQAAAAEAQAAAACBiqPTAAAADklEQVQI12NI
YJgAhAkAB4gB4Ry+pcoAAAAASUVORK5CYII=) }</style>

which seems like it might occur in real life.
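
One way to see how the current URI token treats these cases is to transcribe it as a regular expression. The Python sketch below uses simplified approximations of the 4.1.1 macros: {nonascii} is taken to be any non-ASCII character, and hex digits are accepted in either case. The name `is_uri_token` is a hypothetical helper, not anything defined by the spec.

```python
import re

# Simplified approximations of the CSS 2.1 section 4.1.1 macros.
# These are assumptions for illustration, not a verbatim transcription.
W = r"[ \t\r\n\f]*"
NONASCII = r"[^\x00-\x7f]"
ESCAPE = r"\\(?:[0-9a-fA-F]{1,6}[ \t\r\n\f]?|[^\n\r\f0-9a-fA-F])"
NL_ESCAPE = r"\\(?:\r\n|[\n\r\f])"
STRING = (
    r'(?:"(?:[^\n\r\f\\"]|' + NL_ESCAPE + "|" + ESCAPE + r')*"'
    r"|'(?:[^\n\r\f\\']|" + NL_ESCAPE + "|" + ESCAPE + r")*')"
)
# Unquoted form: note that the character class excludes all whitespace,
# including newlines, as well as quotes and parentheses.
URI_CHARS = r"(?:[!#$%&*-~]|" + NONASCII + "|" + ESCAPE + ")*"

URI = re.compile(r"url\(" + W + "(?:" + STRING + "|" + URI_CHARS + ")" + W + r"\)")


def is_uri_token(text):
    """Return True if the whole of `text` matches the URI token sketch."""
    return URI.fullmatch(text) is not None
```

Under these assumptions, `url("picture.png")` matches and `url("picture.png` does not, so the unterminated form never becomes a URI token at all. A data: URL on a single line matches, but the same URL with a newline in the middle does not, because {w} permits newlines only directly inside the parentheses, not within the unquoted text.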

3) Most of the above concerns would be resolved naturally if URI were a grammar production rather than a token.  Specifically: in 4.1.1, replace 

  URI   url\({w}{string}{w}\)
        |url\({w}([!#$%&*-~]|{nonascii}|{escape})*{w}\)

with

  URI-KEYWORD   "url("
  URI-LITERAL  ([!#$%&*-~]|{nonascii}|{escape})*

and replace

  any : [ ... | URI | ... ] S* ;

with

  any : [ ... | uri | ... ] S* ;
  uri : URI-KEYWORD S* [ STRING | URI-LITERAL ] S* ')' ;

Make similar changes to Appendix G.  [It would be even simpler to allow URI-LITERAL as an "any" production in the core syntax, and treat "url(" as a FUNCTION, but I suspect that would have undesirable side effects, since URI-LITERAL can match so much.]

With this change, the answer to (1) is that the closure rule clearly does apply to url(...) constructs.  (2) is a little more subtle: the quoted form is now just a STRING, so the end-of-line rule clearly applies; in the unquoted form, the URI-LITERAL simply ends at the newline.  If there is more text on a subsequent line before the close paren, the "Malformed declarations" rule triggers, and the effect is the same: the declaration is dropped.
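
The behaviour of the proposed `uri` production, combined with the end-of-style-sheet closure rule, might look roughly like the Python sketch below. It rests on several assumptions: `parse_uri` is a hypothetical helper, escapes are not handled, and only the url(...) construct itself is parsed, not the enclosing declaration or block.

```python
import re

WS = re.compile(r"[ \t\r\n\f]*")
# URI-LITERAL without {escape}, for brevity (an assumption of this sketch).
URI_LITERAL = re.compile(r"(?:[!#$%&*-~]|[^\x00-\x7f])*")


def parse_uri(text):
    """Parse a url(...) construct that starts `text` and runs to its end.

    Returns the URL text, or None if the construct is malformed and the
    enclosing declaration would be dropped.
    """
    if not text.startswith("url("):      # URI-KEYWORD
        return None
    pos = WS.match(text, 4).end()        # S*
    if pos < len(text) and text[pos] in "\"'":
        quote = text[pos]
        pos += 1
        start = pos
        while pos < len(text) and text[pos] != quote:
            if text[pos] in "\n\r\f":
                return None              # unexpected end of line in a STRING
            pos += 1
        value = text[start:pos]
        if pos < len(text):
            pos += 1                     # consume the closing quote
        # otherwise: end of style sheet, so the string is closed implicitly
    else:
        m = URI_LITERAL.match(text, pos)
        value = m.group()
        pos = m.end()                    # the literal stops at any whitespace
    pos = WS.match(text, pos).end()      # S*
    if pos == len(text) or text[pos] == ")":
        return value                     # ')' found, or closed at end of sheet
    return None                          # stray text before ')': malformed
```

In this sketch, `parse_uri('url("picture.png')` returns `picture.png`, since both the string and the parenthesis are closed at the end of the input, whereas the multi-line data: URL returns None because text follows the newline before the close paren.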

Tangential question 1: Instead of defining COMMENT as a separate terminal token that is then invisible in the grammar, why isn't S defined as ({w}|{comment}), where {comment} has the same definition as the current COMMENT?  This would simplify the core grammar and surrounding text without affecting the parsing.

Tangential question 2: Is there any |value|, either in CSS2.1 or any proposed CSS3 module, whose syntax contains square brackets?  Without one I cannot think of a way to test end-of-style-sheet processing for open square brackets.

zw

Received on Wednesday, 14 May 2008 15:49:04 UTC