- From: Stefan Eissing <stefan.eissing@greenbytes.de>
- Date: Fri, 22 Feb 2002 10:12:29 +0100
- To: Erik Seaberg <erk@flyingcroc.com>
- Cc: dav-dev@lyra.org, w3c-dist-auth@w3.org, Julian Reschke <julian.reschke@gmx.de>
Am Donnerstag den, 21. Februar 2002, um 22:50, schrieb Erik Seaberg: > Julian Reschke <julian.reschke@gmx.de> writes: > [...] >> - Broken clients may use request/destination URIs which are not >> encoded in UTF-8 > > I thought (per RFCs 2396 and 2616) the "http" scheme just specifies a > mapping for each part of the URL from US-ASCII to a sequence of > octets, and the mapping from those octets back to characters (if any!) > is at the whim of the server. %-escaped UTF-8 is *recommended* for > new schemes (and for HTML that embeds URLs containing invalid > non-ASCII characters), but a server that uses Latin-15 should be able > to send Latin-15 in PROPFIND responses without having a client > complain about or mangle perfectly workable URLs just because it can't > display them nicely. What you say is correct from a pure protocol perspective. Nowhere in RFC 2518 or 2616 is anything said about decoded URIs. And RFC 2396 says that there is no such thing as a decoded URI. All true. The problem appears when you put a user interface on top of webdav. Be it a Windows WebFolder or an OS X mounted volume. People just expect nowadays that they can use Umlauts and the Euro sign in filenames. They even expect that they can do that in Windows and Linux and MacOS and will all see the same file on a particular server. As I see it, this requires a standard character encoding for conversion of user input to URIs on the client side, namely UTF-8. For the broken clients, Julian is talking about, it would be good if servers make the effort to detect non-UTF-8 encoding and apply characters conversion (most likely from 8859-1). I've seen Tomcat looking for charset parameters in URIs to detect char encoding. I'm not sure if I really understand what they are trying to do. It looks like clients could use GET /test/folder/%d7%c4;charset=ISO-8859-7 HTTP/1.1 in their requests. Does anyone know more about that? //Stefan
Received on Friday, 22 February 2002 04:13:00 UTC