
Re: expected results for URI encoding tests?

From: Julian Reschke <julian.reschke@gmx.de>
Date: Fri, 27 Jun 2008 20:05:02 +0200
Message-ID: <48652BCE.1000306@gmx.de>
To: Dan Connolly <connolly@w3.org>
CC: Philip Taylor <pjt47@cam.ac.uk>, "public-html@w3.org WG" <public-html@w3.org>

Dan Connolly wrote:
> ...
> The present design looks pretty fragile and complex; the cost
> of everybody dealing with it going forward would seem to be
> higher than the cost of breaking the pages that depend on it...
> but maybe not... authors and developers of new stuff can
> avoid the complexity by sticking to utf-8, I suppose. Hmm.
> ...

Well, sort of. Of course it seems to be a good idea anyway to encode 
pages in UTF-8. However, there may be reasons not to (Asian languages, 
anyone?), and at least in theory some intermediary could recode the 
page (it's a text/* MIME type, after all).

Page producers can always avoid problems for the HTML URLs they send by 
using percent-escaped UTF-8, making the "HTML URLs" proper RFC 3986 
(all-ASCII) URIs.
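For instance, a quick sketch of that escaping in Python (illustration only, not code from this thread):

```python
from urllib.parse import quote

# Percent-escape the non-ASCII characters as UTF-8 octets, yielding
# an all-ASCII string that is a valid RFC 3986 URI path component.
path = quote("café", encoding="utf-8")
print(path)  # caf%C3%A9
assert path.isascii()
```

Any consumer can handle such a URL without knowing the page encoding, since all the encoding-sensitive information has been reduced to plain ASCII.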

It gets really interesting only with form submission, where it's the 
browser that's constructing the query part. I really wish we had a way 
to force the browser to use UTF-8, *no matter* what the page encoding 
is. And no, that doesn't need to be a default. An opt-in, such as a new 
attribute on the form element, would be sufficient for now.
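To illustrate why the page encoding matters here (a sketch using Python's urllib as a stand-in for the browser's serializer, not actual browser code): the very same form value produces different query strings depending on the encoding used to build them.

```python
from urllib.parse import urlencode

form = {"q": "é"}

# A browser on a UTF-8 page would escape the value's UTF-8 octets...
print(urlencode(form, encoding="utf-8"))   # q=%C3%A9

# ...while on a Latin-1 page the same value escapes differently, so the
# server cannot decode the query without knowing the page's encoding.
print(urlencode(form, encoding="latin-1"))  # q=%E9
```

An opt-in that pins the submission encoding to UTF-8 would make the first form the only one servers ever see.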

BR, Julian
Received on Friday, 27 June 2008 18:05:46 GMT
