- From: L. David Baron <dbaron@dbaron.org>
- Date: Wed, 5 Aug 2009 21:32:46 -0700
- To: W3C Emailing list for WWW Style <www-style@w3.org>
On Tuesday 2009-06-16 12:46 -0700, Zack Weinberg wrote:
> invalid-url1 url\({w}([!#$%&*-~]|{nonascii}|{escape})*{w}
> invalid-url2 url\({w}{invalid}
It's worth noting that this proposal *still* doesn't eliminate all
arbitrary backtracking in the formal tokenizer (although I think it
produces the correct results). In particular, it still requires
that:
url(arbitrarily-long-text f)
be tokenized starting with a FUNCTION token, and then using
parenthesis-matching in the parser.
It's possible this could be fixed using a third invalid-url token,
although that would give different results for the case:
url(foo {)
and perhaps also for the case:
url(foo ()
than would doing the parenthesis/brace/bracket matching according to
the parsing rules.
I think I actually prefer using the parsing rules for
parenthesis/bracket/brace matching. One way to represent this
"formally" may be by giving up on representing url() as a single
token, but instead switching the tokenizer into a different state
(what flex calls start conditions) while inside url().
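
As a rough illustration of the start-condition idea, here is a minimal
Python sketch (not flex, and not proposed grammar text). The state names,
the token names URL_OPEN, URL_CHARS and URL_CLOSE, and the simplified
IDENT pattern are all made up for this example; {nonascii}, {escape} and
quoted strings are left out. It also assumes the tokenizer leaves the
url() state at the first ")", with any further matching left to the
parser. The point is only that each token is bounded, so the tokenizer
never has to scan to the end of the url() and then back up.

```python
import re

# Hypothetical per-state rule tables, tried in order (first match wins).
# Token names are illustrative only; they are not from any CSS grammar.
RULES = {
    "INITIAL": [
        ("URL_OPEN", re.compile(r"url\(", re.I)),
        ("S", re.compile(r"[ \t\r\n\f]+")),
        ("IDENT", re.compile(r"-?[A-Za-z_][A-Za-z0-9_-]*")),  # simplified
        ("DELIM", re.compile(r".", re.S)),
    ],
    "URL": [
        ("S", re.compile(r"[ \t\r\n\f]+")),
        ("URL_CLOSE", re.compile(r"\)")),
        # Same character class as in the quoted invalid-url1 production,
        # minus {nonascii} and {escape}.
        ("URL_CHARS", re.compile(r"[!#$%&*-~]+")),
        ("DELIM", re.compile(r".", re.S)),
    ],
}


def tokenize(text):
    """Return (name, lexeme) pairs, switching state inside url()."""
    state, pos, tokens = "INITIAL", 0, []
    while pos < len(text):
        # DELIM matches any single character, so some rule always matches.
        for name, pattern in RULES[state]:
            match = pattern.match(text, pos)
            if match:
                break
        tokens.append((name, match.group()))
        pos = match.end()
        if name == "URL_OPEN":
            state = "URL"       # enter the url() start condition
        elif name == "URL_CLOSE":
            state = "INITIAL"   # assumption: leave at the first ")"
    return tokens


for sample in ("url(arbitrarily-long-text f)", "url(foo {)", "url(foo ()"):
    print(sample, "->", [name for name, _ in tokenize(sample)])
```

With this split, deciding that "url(arbitrarily-long-text f)" is not a
valid URL becomes the parser's job: it sees two URL_CHARS runs separated
by whitespace and can recover using its usual matching rules.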
-David
--
L. David Baron http://dbaron.org/
Mozilla Corporation http://www.mozilla.com/
Received on Thursday, 6 August 2009 04:33:22 UTC