Re: Error handling in URIs

From: Dan Connolly <connolly@w3.org>
Date: Fri, 27 Jun 2008 11:10:23 -0500
To: elharo@metalab.unc.edu
Cc: Ian Hickson <ian@hixie.ch>, uri@w3.org
Message-Id: <1214583023.11021.426.camel@pav.lan>

On Fri, 2008-06-27 at 07:47 -0700, Elliotte Harold wrote:
> Ian Hickson wrote:
[...]
> > The second is with IRIs and character encodings other than UTF-8. While 
> > browsers reliably encode non-ASCII characters in the path using UTF-8, 
> > non-ASCII characters in the query component are encoded using the 
> > document's character encoding, and not UTF-8, which is incompatible with 
> > how the IRI spec defines things.
> 
> You mean, for instance, when submitting a form using GET? Interesting. 
> If so that's a flat-out browser bug and should be fixed.

The bug has been reported, and in fact both
Firefox and IE support using UTF-8 in this situation,
but that support is not turned on by default.
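
To make the mismatch Ian describes concrete, here is a minimal Python
sketch. The helper name encode_like_browser and the sample URL are
made up for illustration; it only approximates the behavior described
above (path always in UTF-8, query in the document's encoding) and is
not taken from any browser's code.

```python
from urllib.parse import quote


def encode_like_browser(path, query, document_charset):
    """Hypothetical helper: percent-encode a path and a query the way the
    thread describes browsers doing it (an approximation, not browser code).
    """
    # Path: non-ASCII characters are percent-encoded from their UTF-8 bytes.
    encoded_path = quote(path, safe="/", encoding="utf-8")
    # Query: non-ASCII characters are percent-encoded from their bytes in
    # the document's character encoding, which may not be UTF-8.
    encoded_query = quote(query, safe="=&", encoding=document_charset)
    return encoded_path + "?" + encoded_query


# Same characters, different bytes, depending on the page's encoding.
print(encode_like_browser("/café", "q=café", "iso-8859-1"))
print(encode_like_browser("/café", "q=café", "utf-8"))
```

With an ISO-8859-1 document the "é" in the path becomes %C3%A9 while
the "é" in the query becomes %E9; RFC 3987 would have both be %C3%A9.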

In Bug 42899 (iri) – IRI support (RFC 3987), Reported: 2000-06-16,
I found:

"Set network.standard-url.encode-utf8 to "true" in about:config. See bug
284474 for why this is not the default."

that refers to:

Bug 284474 – Converting to UTF-8 a url with an unescaped non-ASCII chars in the query part leads to an incompatibility with most server-side programs
https://bugzilla.mozilla.org/show_bug.cgi?id=284474
reported 2005-03-02
RESOLVED FIXED (hmm... how can I find out when it was resolved?)
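
The incompatibility named in that bug title can be sketched in a few
lines of Python. This assumes a server-side program that decodes query
strings with one fixed charset; decode_query is a hypothetical name,
not anything from the bug report.

```python
from urllib.parse import unquote


def decode_query(raw_query, server_charset):
    """Hypothetical server-side step: percent-decode a raw query string
    using the single charset the program was written to expect.
    """
    return unquote(raw_query, encoding=server_charset, errors="replace")


# The browser sent UTF-8 bytes, but the server expects ISO-8859-1:
# each UTF-8 byte is read as its own Latin-1 character.
print(decode_query("q=caf%C3%A9", "iso-8859-1"))

# The browser sent ISO-8859-1 bytes, but the server expects UTF-8:
# the lone byte E9 is not valid UTF-8 and is replaced with U+FFFD.
print(decode_query("q=caf%E9", "utf-8"))
```

Either way the server sees something other than "café", which is why
changing what the browser sends by default is not a free fix.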


There are lots of more or less directly related bugs. I'm
swimming around in them, trying to review the arguments...

  Bug 169388 – Handling of non ASCII characters in URLs
  https://bugzilla.mozilla.org/show_bug.cgi?id=169388
  status: NEW
  reported 2002-09-18

  Bug 261929 – Consider sending urls in UTF-8 by default (images/links
  with non-ASCII characters not displayed)
  https://bugzilla.mozilla.org/show_bug.cgi?id=261929
  status: RESOLVED FIXED
  reported 2004-09-28

I also found this article kinda helpful...

  An Introduction to Multilingual Web Addresses
  Richard Ishida, W3C
  first published 2005-01-14. Last substantive update 2008-05-09
  http://www.w3.org/International/articles/idn-and-iri/

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
gpg D3C2 887B 0F92 6005 C541  0875 0F91 96DE 6E52 C29E
Received on Friday, 27 June 2008 16:22:48 GMT
