- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Tue, 08 Jul 2008 10:03:12 +0200
- To: Martin Duerst <duerst@it.aoyama.ac.jp>
- CC: Justin James <j_james@mindspring.com>, 'Ian Hickson' <ian@hixie.ch>, 'Sam Ruby' <rubys@us.ibm.com>, 'HTTP Working Group' <ietf-http-wg@w3.org>, public-html@w3.org
Martin Duerst wrote:
> ...
> It may or may not need such a special case. The truth is that some years
> ago (less than 10), virtually all existing non-ASCII path information
> in (U/I)RIs had to be interpreted in the encoding of the containing page.
> This has changed, because people started to pick up on the idea of IRIs,
> more and more systems used UTF-8 on the server side, and at least some
> people understood that using the encoding of the containing page
> made it impossible to treat such identifiers free-standing. Also, a
> fallback for paths in legacy encodings is still available (and was always
> available): %-encoding.
>
> As long as query URIs are interpreted based on the encoding of the
> containing page, they will stay useless without that context. I.e.
> they cannot (without further pain) be put into bookmark lists, they
> cannot be sent in email, and so on. The only sensible way to make
> this possible is to do the same as for the path part, namely use
> UTF-8 for the IRI->URI conversion. Freestanding (U/I)RIs with
> query parts may be less important than freestanding (U/I)RIs
> without query parts, but still, they are often convenient.
> However, they won't work if implemented the way HTML5 is currently
> describing them. Also, same as for path parts, a fallback for query
> parts in legacy encodings is still available (and was always
> available): %-encoding.
>
> In summary, there are cases where things changed to the better
> in the last few years, and there are cases where some solutions
> make the Web work better than others.
> ...

Note that HTML5 documents that aren't encoded in UTF-8 (or UTF-16) and that carry non-ASCII query parameters are currently non-conformant. (I personally don't think it makes a big difference in practice, as HTML5 normatively defines their handling, so people will rely on that anyway.)

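To make the dependence on the page encoding concrete, here is a minimal Python sketch. The query value `café`, the choice of ISO-8859-1 as the legacy page encoding, and the helper name `percent_encode_query_value` are assumptions chosen purely for illustration; they do not come from HTML5 or from any of the specs under discussion.

```python
from urllib.parse import quote, unquote


def percent_encode_query_value(value: str, encoding: str) -> str:
    """Percent-encode a query value using the given character encoding.

    Hypothetical helper: it only illustrates how the resulting URI
    changes with the encoding used for the IRI->URI conversion.
    """
    return quote(value, safe="", encoding=encoding)


# The same non-ASCII query value, converted under two different encodings.
legacy = percent_encode_query_value("café", "iso-8859-1")  # page-encoding style
utf8 = percent_encode_query_value("café", "utf-8")         # UTF-8 style

print(legacy)  # caf%E9
print(utf8)    # caf%C3%A9

# A recipient that assumes UTF-8 recovers only the UTF-8 form.
print(unquote(utf8))                           # café
print(unquote(legacy))                         # caf\ufffd (replacement character)

# The legacy form is recoverable only if the page encoding is known.
print(unquote(legacy, encoding="iso-8859-1"))  # café
```

Once the URI is taken out of its containing page (a bookmark, an email), the information needed for the last call is gone, whereas the UTF-8 form decodes the same way everywhere.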
>> That can be done in a separate spec, defining a mapping from "HTTP URL"
>> to IRI reference, and then letting the default URI/IRI rules apply.
>
> I'm very much confused by "HTTP URL". In case that's the term that HTML5
> currently uses, it should use a different one, to avoid confusion.

Actually, I wanted to say "HTML URL" (URL as used in HTML5). HTML5 really uses just the term "URL".

BR, Julian

Received on Tuesday, 8 July 2008 08:03:58 UTC