Re: expected results for URI encoding tests?

From: Philip Taylor <pjt47@cam.ac.uk>
Date: Fri, 27 Jun 2008 15:00:43 +0100
Message-ID: <4864F28B.5010403@cam.ac.uk>
To: Julian Reschke <julian.reschke@gmx.de>
CC: "public-html@w3.org WG" <public-html@w3.org>

Julian Reschke wrote:
> [...]
> But even when the document encoding is percent-escaped, there's still an 
> issue when a character in the input "URL" can not be mapped to the 
> document encoding; it would be nice to have a test case for that (or do 
> we?).

I'm not sure if one of Hixie's tests covers this already, so I just 
tried the same 002.html test case as before but with 

IE6, Opera 9.5, Safari 3.0 go to "results.cgi/%E2%98%B9??" (i.e. replace 
unmappable characters with an ASCII "?").

FF2, FF3 go to "results.cgi/%E2%98%B9?%E2%98%B9".

In particular, FF2/FF3 appear to switch to encoding a whole component as 
UTF-8 if it contains a character that can't be mapped into the document's 
character encoding. So in FF3:

'/\u017d?\u017d' => '/%C5%BD?%DE'
'/\u017d?\u017d\u2639' => '/%C5%BD?%C5%BD%E2%98%B9'
'/\u017d\u2639?\u017d' => '/%C5%BD%E2%98%B9?%DE'

i.e. the encoding of the query depends on the characters in it.
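
A rough Python sketch of the two behaviours described above, for anyone who wants to reproduce the strings. It is not taken from any browser's source. The function names are made up for illustration, and the document encoding is assumed to be windows-1257 (Python's "cp1257"), chosen only because it maps U+017D to byte 0xDE as in the FF3 results; the message does not say which encoding the test page actually uses.

```python
from urllib.parse import quote

# Assumption: the test page is windows-1257. Not stated in the message.
DOC_ENCODING = "cp1257"


def encode_query_replace(query, doc_encoding=DOC_ENCODING):
    """IE6 / Opera 9.5 / Safari 3.0 style, as observed: characters that
    cannot be mapped to the document encoding become a literal '?'."""
    raw = query.encode(doc_encoding, errors="replace")
    # Keep '?' unescaped so the substituted characters show up as "?".
    return quote(raw, safe="?")


def encode_query_fallback(query, doc_encoding=DOC_ENCODING):
    """FF2 / FF3 style, as observed: if any character is unmappable,
    the whole query is encoded as UTF-8 instead."""
    try:
        raw = query.encode(doc_encoding)
    except UnicodeEncodeError:
        raw = query.encode("utf-8")
    return quote(raw)


def encode_url(url, encode_query):
    """Percent-encode the path as UTF-8 and the query with the given
    strategy. Splits on the first '?' only; no other URL parsing."""
    path, sep, query = url.partition("?")
    return quote(path.encode("utf-8"), safe="/") + sep + encode_query(query)


if __name__ == "__main__":
    for url in ("/\u017d?\u017d", "/\u017d?\u017d\u2639", "/\u017d\u2639?\u017d"):
        print(ascii(url), "=>", encode_url(url, encode_query_fallback))
```

With the fallback strategy, the three inputs above should give the same strings as the FF3 results; swapping in encode_query_replace gives the "?" substitution instead.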

(I haven't uploaded test cases for this anywhere, since I don't have a 
trivial way to make the results easy to interpret.)

Philip Taylor
Received on Friday, 27 June 2008 14:01:23 UTC
