Jamie Lokier wrote:
> Those web servers _far_ predate RFC2616. Whatever guidance goes into
> an HTTP URI standard, it must remain backward compatible with what's
> widely deployed, which is precisely why the RFCs don't mandate it yet,
> even as they suggest further work is needed on it.

Right!

> The Location header only has that effect in _web browsers_.

Sorry? It's for instance used in 3xx (redirect) responses.

> There are lots of other programs which use HTTP for which the
> "characters" encoded in a URL are irrelevant.
>
> Increasingly, we may find that non-web-browser HTTP agents see
> non-ASCII characters in parts of a document that claim to be URIs, and
> must follow them. Or, they see URIs containing %-encoded characters
> and need to convert those to presentable text in documents.

Yes. This is a common issue in WebDAV. See
<http://greenbytes.de/tech/webdav/draft-reschke-webdav-url-constraints-latest.html>
(work in progress).

> Broadly, the UTF-8-ness affects programs which relate documents
> containing non-ASCII characters with URLs. For example, a spider
> which indexes pages that happen to contain non-ASCII characters in the
> URLs in "href" attributes... those are actually not valid URLs, but
> the spider has to make a heuristic decision if it's to follow them.
>
> Unfortunately, if we mandate that non-ASCII characters found in "href"
> URL attributes should be %-escaped as UTF-8 to follow them, we'll find
> that this *breaks* some existing deployed sites. Maybe this is for
> the best...

It's ugly, but probably still the best approach.

> ...

Best regards, Julian

Received on Tuesday, 23 August 2005 16:05:00 GMT
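
To make the trade-off in the quoted discussion concrete, here is a minimal Python sketch of what "%-escaping non-ASCII href characters as UTF-8" could look like for a spider. The helper name `escape_href` and the `example.org` URL are assumptions made for illustration, not anything taken from the thread or from a specification. The last line shows how a server that expects Latin-1 bytes would misread the UTF-8 form, which is the kind of breakage on existing sites that the quoted text warns about.

```python
from urllib.parse import quote, unquote

# Hypothetical helper (not from the thread): percent-escape only the
# non-ASCII characters of an href. ASCII delimiters and existing
# %-escapes are listed as "safe" so they pass through unchanged.
def escape_href(href: str, encoding: str = "utf-8") -> str:
    return quote(href, safe="/:?#[]@!$&'()*+,;=%~", encoding=encoding)

href = "http://example.org/caf\u00e9"   # illustrative URL ending in "é"

as_utf8 = escape_href(href)               # the UTF-8 convention
as_latin1 = escape_href(href, "latin-1")  # what a legacy site may expect

print(as_utf8)    # http://example.org/caf%C3%A9
print(as_latin1)  # http://example.org/caf%E9

# A server decoding the UTF-8 form as Latin-1 sees two characters
# instead of one, so the lookup for the original resource fails.
print(unquote("caf%C3%A9", encoding="latin-1"))
```

The two escaped forms differ at the byte level, so a client cannot follow such a link reliably without knowing (or guessing) which convention the server uses.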