- From: Simon Sapin <simon.sapin@kozea.fr>
- Date: Thu, 17 May 2012 12:12:21 +0200
- To: www-style list <www-style@w3.org>
Hi,

There are multiple definitions of what is a valid URL/URI/IRI:

RFC 3986 (the latest on URIs) only uses a subset of ASCII characters. Everything else is invalid/illegal, including all characters above U+007F.

IRIs (RFC 3987) extend the grammar to allow most non-ASCII Unicode characters, and define how to turn an IRI into a URI (in short: UTF-8, then %-encode).

HTML5 (chapter 2.6: URLs) goes even further and allows all characters from U+0000 to U+10FFFF, although it has a convoluted way of saying it, and some strings can still be invalid.

For defining the <url> type, both css21 and css3-values have a reference to RFC 3986. Do we really want to be that restrictive? In CSS syntax, this declaration parses with a valid URI token. Should the URI inside be invalid?

    list-style-image: url("Hello <世界>.png");

I suggest we relax the syntax and do something like HTML5. Maybe mention IRIs and their conversion to URIs.

Wherever the limit for validity ends up, what should happen to invalid URIs? The options I can think of are:

1. Make the value, and thus the declaration/rule, invalid. The cascade does its usual fallback. Just like only some HASH tokens are valid hexadecimal <color> values, only some URI tokens would be valid <url> values.

2. Have them resolve to an invalid URI that always fails to be fetched. As with an HTTP 404 error, other fallbacks occur (list-style-type is used instead of list-style-image, ...)

3. Make sure that all Unicode strings are parsable/valid. (I don’t know if this is doable *or* a good idea.)

Regards,
-- 
Simon Sapin
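
For illustration, the "UTF-8, then %-encode" conversion mentioned above can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions, not what any spec or browser mandates: the helper names `uses_only_rfc3986_chars` and `iri_to_uri` are made up for this example, the check looks only at which characters appear (not at the full RFC 3986 grammar), and the conversion also escapes ASCII characters such as space, "<" and ">", which goes beyond the RFC 3987 mapping since those are not valid in an IRI either.

```python
import string
from urllib.parse import quote

# Punctuation that RFC 3986 permits unescaped somewhere in a URI:
# gen-delims, sub-delims, and "%" so that existing escapes are kept.
SAFE_PUNCTUATION = ":/?#[]@" + "!$&'()*+,;=" + "%"

# Full set of characters that may appear in an RFC 3986 URI
# (letters, digits, unreserved punctuation, delimiters, "%").
RFC3986_CHARS = set(string.ascii_letters + string.digits + "-._~" + SAFE_PUNCTUATION)


def uses_only_rfc3986_chars(url: str) -> bool:
    """Hypothetical check: character-level only, not the full URI grammar."""
    return all(ch in RFC3986_CHARS for ch in url)


def iri_to_uri(url: str) -> str:
    """Hypothetical helper: UTF-8 encode, then %-encode every byte of a
    character that RFC 3986 does not allow. Existing "%" signs are left alone."""
    return quote(url, safe=SAFE_PUNCTUATION, encoding="utf-8")


if __name__ == "__main__":
    original = "Hello <世界>.png"
    print(uses_only_rfc3986_chars(original))  # False
    print(iri_to_uri(original))  # Hello%20%3C%E4%B8%96%E7%95%8C%3E.png
```

With something like this, the url() in the list-style-image example fails a strict RFC 3986 character check as written, but maps to a string that passes it after conversion.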
Received on Thursday, 17 May 2012 10:12:57 UTC