Re: expected results for URI encoding tests?

From: Dan Connolly <connolly@w3.org>
Date: Fri, 27 Jun 2008 11:43:18 -0500
To: Philip Taylor <pjt47@cam.ac.uk>
Cc: Julian Reschke <julian.reschke@gmx.de>, "public-html@w3.org WG" <public-html@w3.org>
Message-Id: <1214584998.11021.453.camel@pav.lan>

On Fri, 2008-06-27 at 17:06 +0100, Philip Taylor wrote:
> Julian Reschke wrote:
> > We really should try to define a way that yields UTF-8 based encoding 
> > independently of the document's encoding.
> We also really shouldn't break existing sites that work perfectly well 
> in current web browsers.

That seems to be the position of the Firefox developers (largely
influenced by IE/Opera/WebKit...). It's not a matter of implementation
cost; the code is there, but not turned on by default:

"Set network.standard-url.encode-utf8 to "true" in about:config. See bug
for why this is not the default."

That refers to:

Bug 284474 – Converting to UTF-8 a url with an unescaped non-ASCII chars
in the query part leads to an incompatibility with most server-side
reported 2005-03-02

There are lots of related bugs, reported at least as early as 2000,
and many of them are in NEW state.
If there's any way to get a more concise or authoritative opinion
from the Mozilla folks, I'd appreciate it.

I gather that Mozilla's implementation largely follows Microsoft IE,
and that IE's implementation largely pre-dates the IRI specs,
with Opera following closely behind. WebKit seems to have a little
less legacy baggage but faces the same situation with respect
to content/servers.

The present design looks pretty fragile and complex; the cost
of everybody dealing with it going forward would seem to be
higher than the cost of breaking the pages that depend on it...
but maybe not... authors and developers of new stuff can
avoid the complexity by sticking to UTF-8, I suppose. Hmm.
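
To make the incompatibility concrete, here is a minimal sketch in Python
(not any browser's actual code) of how the same unescaped non-ASCII query
value percent-encodes differently depending on the character encoding
used. The helper name encode_query_value, the sample value "café", and
the choice of ISO-8859-1 as the legacy document encoding are all
assumptions made for illustration.

```python
from urllib.parse import quote, unquote


def encode_query_value(value: str, document_encoding: str) -> str:
    """Percent-encode a query value using the given character encoding.

    Hypothetical helper: it mimics a user agent that escapes the query
    part in the document's encoding rather than always in UTF-8.
    """
    return quote(value, safe="", encoding=document_encoding)


value = "café"

# Query escaped in a legacy document encoding (assumed: ISO-8859-1).
legacy = encode_query_value(value, "iso-8859-1")
print(legacy)  # caf%E9

# Query escaped as UTF-8, independent of the document's encoding.
utf8 = encode_query_value(value, "utf-8")
print(utf8)  # caf%C3%A9

# A server-side program that assumes ISO-8859-1 misreads the UTF-8 form.
print(unquote(utf8, encoding="iso-8859-1"))  # cafÃ©
```

A page served as UTF-8 gets the same bytes under either policy, which is
why sticking to UTF-8 sidesteps the problem for new content.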

Dan Connolly, W3C http://www.w3.org/People/Connolly/
gpg D3C2 887B 0F92 6005 C541  0875 0F91 96DE 6E52 C29E
Received on Friday, 27 June 2008 16:41:44 UTC
