hypertext references and the query component

From: Erik van der Poel <erikv@google.com>
Date: Wed, 22 Jul 2009 10:43:43 -0700
Message-ID: <c07a32650907221043h4a34dd8bjec60869a63bafbdd@mail.gmail.com>
To: public-iri@w3.org
Regarding hypertext references and the query component:

http://tools.ietf.org/html/draft-duerst-iri-bis-06#section-7.4

Step 9.3. says:

           Encode the resulting character sequence into a sequence of
           octets as specified by the HRef-charset; any characters which
           cannot be expressed in HRef-charset should be replaced with
           an (ASCII) '?'.

This is indeed what MSIE 8 and Opera 9 have implemented. Firefox 3.5
converts to UTF-8 when any of the characters cannot be expressed in
the HRef-charset. Safari 4 and Chrome 2 emit the same format that all
browsers use for HTML forms with method GET: a character that cannot
be expressed becomes a numeric character reference (&#NNN;), which is
then %-encoded, e.g. %26%23123%3B, which is the %-encoded &#123;.
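
To make the three behaviors concrete, here is a rough Python sketch.
The function name encode_query_value and the strategy labels are made
up for illustration; they are not taken from the draft or from any
browser. It also assumes that every byte outside the unreserved set is
%-encoded, which is stricter than what the browsers actually emit.

```python
from urllib.parse import quote_from_bytes


def encode_query_value(text, charset, strategy):
    """Encode text for a query component (hypothetical helper).

    strategy is one of:
      "question-mark" - replace unencodable characters with '?'
                        (step 9.3 of the draft; MSIE 8 / Opera 9)
      "utf8-fallback" - use UTF-8 if anything is unencodable
                        (Firefox 3.5)
      "ncr"           - replace unencodable characters with &#NNN;
                        (Safari 4 / Chrome 2, same as GET forms)
    """
    if strategy == "question-mark":
        raw = text.encode(charset, errors="replace")
    elif strategy == "utf8-fallback":
        try:
            raw = text.encode(charset)
        except UnicodeEncodeError:
            raw = text.encode("utf-8")
    elif strategy == "ncr":
        raw = text.encode(charset, errors="xmlcharrefreplace")
    else:
        raise ValueError("unknown strategy: " + strategy)
    # Assumption: %-encode everything except unreserved characters.
    return quote_from_bytes(raw, safe="")


# U+20AC (euro sign) cannot be expressed in ISO-8859-1.
for strategy in ("question-mark", "utf8-fallback", "ncr"):
    print(strategy, encode_query_value("a\u20ac", "iso-8859-1", strategy))
```

When the HRef-charset can express every character, all three
strategies give the same result; they only differ in the fallback.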

From the server's point of view, the server cannot tell whether an
incoming HTTP request is coming from an HTML form or an HTML href, so
it would be good if the browsers chose the same format for both of
them. For this reason, the Safari/Chrome behavior makes sense.

One disadvantage of the MSIE/Opera format (using '?') is that it loses
information. One disadvantage of the Firefox format (UTF-8) is that
the server is not expecting UTF-8.

One disadvantage of the &#NNN; format (at least, how it is implemented
now) is that the ampersand (&) itself is not escaped, so the user
cannot type a literal "&#123;" and have it behave as expected.
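
The ambiguity can be shown with a short Python sketch. The helper name
ncr_encode is hypothetical, and the %-encoding of every byte outside
the unreserved set is an assumption rather than a description of any
particular browser.

```python
from urllib.parse import quote_from_bytes


def ncr_encode(text, charset="iso-8859-1"):
    """Hypothetical helper: &#NNN; fallback, then %-encoding."""
    raw = text.encode(charset, errors="xmlcharrefreplace")
    return quote_from_bytes(raw, safe="")


# The user types the seven characters "&#8364;" literally.
typed_literally = ncr_encode("&#8364;")
# The user types U+20AC, which ISO-8859-1 cannot express.
unencodable = ncr_encode("\u20ac")

# Both inputs produce the same octets on the wire, so the server
# cannot tell which one the user meant.
print(typed_literally == unencodable)
```

With charset="utf-8" every character can be expressed, no reference is
generated, and the two inputs stay distinct.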

My own opinion is that the disadvantages of the MSIE/Opera/Firefox
formats outweigh the disadvantages of the Safari/Chrome format. Also,
Web sites that wish to allow for literal &#NNN; in form submissions
can simply switch to UTF-8.

Erik
Received on Wednesday, 22 July 2009 17:44:22 GMT
