- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Mon, 05 Jan 2015 18:16:02 +0100
- To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
- Cc: Sam Ruby <rubys@intertwingly.net>, Mark Nottingham <mnot@mnot.net>, "public-ietf-w3c@w3.org" <public-ietf-w3c@w3.org>, Julian Reschke <julian.reschke@gmx.de>
* Martin J. Dürst wrote:
>The URL spec, as far as I understand, allows Unicode as input, so in
>that respect, it isn't ghettoizing. But it converts all output to ASCII,
>and so essentially sends a message that Unicode is second-class.
>
>My understanding is that the reason for this is that current browser
>interfaces are working that way, and I'm not against documenting that,
>but I'd wish we could get away from that limitation for the general case
>(i.e. parser results are still Unicode).

There are a couple of conflicting requirements that make that difficult.
If you make an API for resource identifiers, you don't want it to change
behavior when new schemes are introduced; you probably also want that an
input like `example:///ö` is handled the same as `example:///%c3%b6` and
then also avoid turning `data:image/png,...%xx...` into a mix of random
Unicode characters interspersed with %xx escapes that would not round-trip
if decoded. If you want Unicode output, and a data-like scheme is
introduced, you cannot satisfy all requirements.
--
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
D-10243 Berlin · PGP Pub. KeyID: 0xA4357E78 · http://www.bjoernsworld.de
Available for hire in Berlin (early 2015) · http://www.websitedev.de/
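
The conflict described in the reply can be sketched roughly in Python. This is only an illustration under stated assumptions: `ascii_output` and `unicode_output` are hypothetical helpers written for this example, not functions from the URL spec or any browser API, and the `data:` payload is made up. The sketch shows that ASCII output treats `example:///ö` and `example:///%c3%b6` alike (up to hex-digit case), while Unicode output on a data-like payload either loses bytes or produces a mix of decoded characters and leftover %xx escapes.

```python
import re
from urllib.parse import quote, unquote, unquote_to_bytes

# Hypothetical helpers for illustration only; not taken from the URL spec.

def ascii_output(text):
    """Percent-encode non-ASCII as UTF-8; leave existing %xx escapes untouched."""
    return quote(text, safe="/:,;%")

_ESCAPE_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")

def _decode_run(match):
    """Decode escapes that form valid non-ASCII UTF-8; keep all others escaped."""
    raw = unquote_to_bytes(match.group(0))
    out, i = [], 0
    while i < len(raw):
        if raw[i] >= 0x80:
            # A valid multi-byte UTF-8 sequence is 2 to 4 bytes long.
            for width in (2, 3, 4):
                try:
                    out.append(raw[i:i + width].decode("utf-8"))
                except UnicodeDecodeError:
                    continue
                i += width
                break
            else:
                out.append(f"%{raw[i]:02X}")  # not valid UTF-8: stays escaped
                i += 1
        else:
            out.append(f"%{raw[i]:02X}")  # ASCII escapes stay escaped here
            i += 1
    return "".join(out)

def unicode_output(text):
    """Assumed 'Unicode output' policy: decode what decodes, keep the rest."""
    return _ESCAPE_RUN.sub(_decode_run, text)

# ASCII output: both spellings end up equivalent (ignoring hex-digit case).
same_a = ascii_output("example:///ö").lower()
same_b = ascii_output("example:///%c3%b6").lower()

# Unicode output works for the simple case ...
pretty = unicode_output("example:///%c3%b6")

# ... but a data-like payload becomes a mix of characters and escapes.
mixed = unicode_output("data:image/png,%89PNG%0D%0A%C3%B6")

# Decoding everything instead is lossy: 0x89 alone is not valid UTF-8.
lossy = quote(unquote("%89PNG"))
```

Under these assumptions, the only way to keep the `data:` payload intact is to leave it in ASCII form, which is the behavior the quoted message would like to move away from in the general case.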
Received on Monday, 5 January 2015 17:16:50 UTC