- From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Date: Tue, 06 Jan 2015 19:34:46 +0900
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- CC: Sam Ruby <rubys@intertwingly.net>, Mark Nottingham <mnot@mnot.net>, "public-ietf-w3c@w3.org" <public-ietf-w3c@w3.org>, Julian Reschke <julian.reschke@gmx.de>
On 2015/01/06 02:16, Bjoern Hoehrmann wrote:
> * Martin J. Dürst wrote:
>> The URL spec, as far as I understand, allows Unicode as input, so in
>> that respect, it isn't ghettoizing. But it converts all output to ASCII,
>> and so essentially sends a message that Unicode is second-class.
>>
>> My understanding is that the reason for this is that current browser
>> interfaces are working that way, and I'm not against documenting that,
>> but I'd wish we could get away from that limitation for the general case
>> (i.e. parser results are still Unicode).
>
> There are a couple of conflicting requirements that make that difficult.
> If you make an API for resource identifiers, you don't want it to change
> behavior when new schemes are introduced; you probably also want that an
> input like `example:///ö` is handled the same as `example:///%c3%b6` and
> then also avoid turning `data:image/png,...%xx...` into a mix of random
> Unicode characters interspersed with %xx escapes that would not round-
> trip if decoded. If you want Unicode output, and a data-like scheme is
> introduced, you cannot satisfy all requirements.

This is indeed a theoretical problem, but one that in practice rarely
shows up and is rather easily dealt with.

First, data:-like schemes are few and far between.

Second, there's no reason to convert to Unicode sequences of %xx that
can't be converted in full.

Third, the equivalence between "ö" and "%c3%b6" might be provided at a
higher level in the API, because "is handled the same" assumes a
universal equivalence function for URIs and IRIs when the specs clearly
explain that there is no such thing (see
http://tools.ietf.org/html/rfc3986#section-6 and
http://tools.ietf.org/html/rfc3987#section-5).

Regards,   Martin.
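
As a rough illustration of the second point in the message (leaving alone any sequence of %xx that cannot be converted in full), here is a minimal Python sketch. The function name `decode_full_runs` and the extra rule of skipping runs that decode to ASCII characters are assumptions made for this example; neither is taken from the URL spec or from RFC 3987, and a real implementation would need to follow those documents.

```python
import re

# A maximal run of consecutive %xx escapes, e.g. "%c3%b6".
_ESCAPE_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def decode_full_runs(identifier: str) -> str:
    """Replace a run of %xx escapes with Unicode only if the whole run converts.

    Hypothetical helper: a run is converted only when all of its bytes form
    valid UTF-8 and every decoded character is non-ASCII. Otherwise the run
    is returned unchanged, so the result can be re-encoded to the same bytes.
    """

    def replace(match: re.Match) -> str:
        run = match.group(0)
        raw = bytes.fromhex(run.replace("%", ""))
        try:
            text = raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not convertible in full: keep the original escapes.
            return run
        if any(ord(ch) < 0x80 for ch in text):
            # Assumption: never unescape ASCII (it may be a delimiter such as "/").
            return run
        return text

    return _ESCAPE_RUN.sub(replace, identifier)


print(decode_full_runs("example:///%c3%b6"))
print(decode_full_runs("data:image/png,%89PNG%0D%0A"))
```

Under these assumptions the first call yields `example:///ö`, while the `data:` input comes back unchanged, because `%89` is not valid UTF-8 and `%0D%0A` decodes to ASCII control characters.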
Received on Tuesday, 6 January 2015 10:35:21 UTC