- From: Simon Sapin <simon.sapin@kozea.fr>
- Date: Thu, 17 May 2012 16:26:34 +0200
- To: Julian Reschke <julian.reschke@gmx.de>
- CC: www-style list <www-style@w3.org>
On 17/05/2012 12:37, Julian Reschke wrote:
> On 2012-05-17 12:12, Simon Sapin wrote:
>> RFC 3986 (the latest on URIs) only uses a subset of ASCII characters.
>> Everything else is invalid/illegal, including all characters above U+007F.
>
> ...because characters above U+007F are not ASCII characters.

Yes, of course. I only wanted to point out that it is easy for authors to write something that is not valid according to RFC 3986.

> Also, my understanding is that HTML5 doesn't make anything valid that is
> invalid as IRI.

If I read correctly, compared to URIs, IRIs "only" add non-private non-ASCII code points to the list of unreserved characters. In HTML5, on the other hand, every code point that is not reserved or '%' is unreserved. For example, '>' is valid in the latter but not in the former.

http://tools.ietf.org/html/rfc3987#section-2.2
http://www.w3.org/TR/html5/urls.html#parsing-urls

>> For defining the <url> type, both css21 and css3-values have a reference
>> to RFC 3986. Do we really want to be that restrictive? In CSS syntax,
>> this declaration parses with a valid URI token. Should the URI inside be
>> invalid?
>>
>> list-style-image: url("Hello<世界>.png");
>
> What do implementations do with it?

Simpler test case: http://dabblet.com/gist/2719200

<div style="background: url('>é')">

In both Firefox 12 and Chrome 18, an HTTP request is sent to %3E%C3%A9

That is, a string that is invalid in RFC 3987 ('>' is not allowed) is accepted as valid and handled according to RFC 3987 (UTF-8 encoding, then %-encoding).

Other test case: url('%é') is turned into %%C3%A9 by the browser without a warning, but is refused by the server with 400 Bad Request. (Both forms are invalid, either as an IRI or URI.)

>> I suggest we relax the syntax and do something like HTML5. Maybe mention
>> IRIs and their conversion to URIs.
>
> I recommend to stick with the relevant specs, such as either URI or IRI.

I’m fine with that, as long as it is explicit in a spec.

>> 3. Make sure that all Unicode strings are parsable/valid. (I don’t know
>> if this is doable *or* a good idea.)
>
> Making something valid which is invalid "even" in HTML5 doesn't seem
> like a good idea to me.

Yes indeed.

-- 
Simon Sapin
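
The "UTF-8, then %-encoding" handling seen in the test cases above can be approximated with a short Python sketch. This is only an illustration, not what Firefox or Chrome actually run: the names `RFC3986_CHARS`, `chars_outside_rfc3986` and `browser_like_encode` are made up for this example, and the assumption that '%' is passed through untouched is inferred from the `%%C3%A9` result reported above.

```python
from urllib.parse import quote

# Characters RFC 3986 allows to appear literally in a URI:
# unreserved, gen-delims, sub-delims, plus "%" to introduce
# percent-encoded triplets.
RFC3986_CHARS = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "-._~"           # rest of unreserved
    ":/?#[]@"        # gen-delims
    "!$&'()*+,;="    # sub-delims
    "%"
)


def chars_outside_rfc3986(text):
    """List the characters of `text` that RFC 3986 never allows literally.

    This is a per-character check only: it does not verify that each "%"
    is followed by two hex digits, so "%é" reports only "é".
    """
    return [c for c in text if c not in RFC3986_CHARS]


def browser_like_encode(text):
    """Assumed approximation of the observed browser behaviour:
    encode as UTF-8, then percent-encode every byte outside the
    RFC 3986 character set, leaving "%" itself untouched."""
    return quote(text, safe=":/?#[]@!$&'()*+,;=%", encoding="utf-8")


if __name__ == "__main__":
    for sample in (">é", "%é", "Hello<世界>.png"):
        print(repr(sample), chars_outside_rfc3986(sample),
              browser_like_encode(sample))
```

With these assumptions, `browser_like_encode(">é")` gives `%3E%C3%A9` and `browser_like_encode("%é")` gives `%%C3%A9`, matching the two requests described above. The second result is still not a valid URI, which is consistent with the server answering 400 Bad Request.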
Received on Thursday, 17 May 2012 14:27:10 UTC