- From: Jamie Lokier <jamie@shareable.org>
- Date: Mon, 22 Aug 2005 14:20:08 +0100
- To: Robert Collins <robertc@robertcollins.net>
- Cc: "William A. Rowe, Jr." <wrowe@rowe-clan.net>, Julian Reschke <julian.reschke@gmx.de>, HTTP Working Group <ietf-http-wg@w3.org>
Robert Collins wrote:
> > Ack :) The more comprehensive solution of course, HTTP/1.2,
> > although I know some have their hearts set on HTTP-NG first.
>
> I'd be happy with a HTTP/1.1 errata that updates the http:// scheme to
> declare it as utf8 before the escape encoding is done.

Not reasonable. There are a significant number of HTTP/1.1-compliant
servers which work with URLs that are derived from text in other
encodings, and there are servers where the encoding depends on the URL
(because the server's job is to pass along the URL unmodified to
individual resource handlers).

Because of that, proxies must continue to work with URIs that contain
arbitrary %-escaped sequences, without filtering or changing them (even
if they don't represent valid UTF-8), servers must continue to be able
to serve documents containing such URIs, and clients must continue to
be able to retrieve documents using those URIs.

In principle, the escape-encoding represents an application-specific
opaque octet stream, and it need not represent "characters" at all.

An appropriate place to define UTF-8 as the encoding to use would be in
document standards, such as XML and HTML, as this question really is
about how to convert character sequences (in documents and user
interfaces) that feature non-ASCII characters and purport to be URIs
(but aren't really URIs) into well-formed URIs for network operations.

The places where it's useful to specify a character encoding are:

  - How non-ASCII characters in documents, in places such as an "href"
    attribute, are converted into proper URIs for HTTP.

  - How non-ASCII characters in forms are converted into proper URI
    query parts. (This is covered somewhat already in HTML 4.)

  - How non-ASCII characters in other parts of a typical client's user
    interface, such as the "location bar", are converted into proper
    URLs for HTTP document retrieval.

-- Jamie
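
To make the distinction above concrete, here is a minimal sketch in Python, not taken from the thread. It assumes the standard library's urllib.parse module; the helper names (escaped_octets, is_valid_utf8, to_uri_path) and the example paths are hypothetical. It shows that a %-escaped path decodes to plain octets that need not be valid UTF-8, and that the choice of encoding only arises when a client turns non-ASCII text into a URI.

```python
from urllib.parse import quote, unquote_to_bytes


def escaped_octets(path: str) -> bytes:
    """Decode %-escapes to raw octets without assuming any character encoding."""
    return unquote_to_bytes(path)


def is_valid_utf8(octets: bytes) -> bool:
    """Report whether the octets happen to form valid UTF-8."""
    try:
        octets.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_uri_path(text: str, encoding: str = "utf-8") -> str:
    """Client-side step: turn non-ASCII text (href, location bar) into a URI path.

    The encoding is chosen here, by the document or user-interface layer,
    not by HTTP itself. Defaulting to UTF-8 is an assumption of this sketch.
    """
    return quote(text, safe="/", encoding=encoding)


# Hypothetical paths: the same word escaped from Latin-1 and from UTF-8 text.
latin1_path = "/caf%E9"
utf8_path = "/caf%C3%A9"

# Both are well-formed URI paths, but only one decodes as UTF-8.
for path in (latin1_path, utf8_path):
    octets = escaped_octets(path)
    print(path, octets, "valid UTF-8" if is_valid_utf8(octets) else "not UTF-8")

# The same visible text yields different URIs depending on the encoding.
print(to_uri_path("/café"))
print(to_uri_path("/café", encoding="latin-1"))
```

A proxy in this picture would forward latin1_path byte for byte, since it has no way to know whether the escaped octets were ever meant to be characters.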
Received on Monday, 22 August 2005 13:20:31 UTC