Re: [url] Requests for Feedback (was Feedback from TPAC) from Bjoern Hoehrmann on 2015-01-05 (public-ietf-w3c@w3.org from January 2015)

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Mon, 05 Jan 2015 18:16:02 +0100
To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Cc: Sam Ruby <rubys@intertwingly.net>, Mark Nottingham <mnot@mnot.net>, "public-ietf-w3c@w3.org" <public-ietf-w3c@w3.org>, Julian Reschke <julian.reschke@gmx.de>
Message-ID: <90glaat1sil41kr2snhksajm3d35fthh1h@hive.bjoern.hoehrmann.de>

* Martin J. Dürst wrote:
>The URL spec, as far as I understand, allows Unicode as input, so in 
>that respect, it isn't ghettoizing. But it converts all output to ASCII, 
>and so essentially sends a message that Unicode is second-class.
>
>My understanding is that the reason for this is that current browser 
>interfaces are working that way, and I'm not against documenting that, 
>but I'd wish we could get away from that limitation for the general case 
>(i.e. parser results are still Unicode).

There are a couple of conflicting requirements that make that difficult.
If you design an API for resource identifiers, you do not want its
behavior to change when new schemes are introduced. You probably also
want an input like `example:///ö` to be handled the same as
`example:///%c3%b6`. And you want to avoid turning
`data:image/png,...%xx...` into a mix of random Unicode characters
interspersed with %xx escapes that would not round-trip if decoded. If
you want Unicode output and a data-like scheme is introduced, you cannot
satisfy all of these requirements.
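
To make the conflict concrete, here is a minimal sketch using Python's
standard `urllib.parse`. It is not the URL spec's algorithm; the helper
names `to_unicode`, `to_ascii` and `round_trips` are hypothetical, the
PNG-like payload is made up for illustration, and "Unicode output" is
assumed to mean "decode every %xx escape as UTF-8".

```python
from urllib.parse import quote, unquote, unquote_to_bytes


def to_unicode(url: str) -> str:
    """Hypothetical 'Unicode output': decode every %xx escape as UTF-8.

    Bytes that are not valid UTF-8 become U+FFFD (errors="replace").
    """
    return unquote(url, encoding="utf-8", errors="replace")


def to_ascii(url: str) -> str:
    """ASCII output: percent-encode non-ASCII, keep existing escapes."""
    return quote(url, safe=":/,;%")


def same_bytes(a: str, b: str) -> bool:
    """Compare two identifiers by the bytes they denote after unescaping."""
    return unquote_to_bytes(a) == unquote_to_bytes(b)


def round_trips(url: str) -> bool:
    """True if Unicode output can be turned back into the original bytes."""
    return same_bytes(to_ascii(to_unicode(url)), url)


text_url = "example:///%c3%b6"
# Illustrative payload only: the first bytes of a PNG signature.
data_url = "data:image/png,%89PNG%0D%0A%1A%0A"

print(same_bytes("example:///ö", text_url))  # both spellings denote the same bytes
print(round_trips(text_url))                 # escapes are UTF-8 text: survives
print(round_trips(data_url))                 # escapes are binary data: does not
```

Under these assumptions the `example:` identifier survives the trip
through Unicode output, while the `data:`-like one does not, because
its escapes encode arbitrary bytes rather than UTF-8 text. A
scheme-independent API cannot know in advance which case it is in.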
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
D-10243 Berlin · PGP Pub. KeyID: 0xA4357E78 · http://www.bjoernsworld.de
 Available for hire in Berlin (early 2015)  · http://www.websitedev.de/

Received on Monday, 5 January 2015 17:16:50 UTC