
[whatwg] URL query component

From: Anne van Kesteren <annevk@opera.com>
Date: Fri, 20 Apr 2012 11:15:16 +0200
Message-ID: <op.wc13zqxp64w2qv@annevk-macbookpro.local>
The URL query component for URLs found in HTML (exact set still to be
defined, I think) uses the page encoding, unless the page encoding is
utf-8 or utf-16, in which case it uses utf-8.

E.g. "?&euro;" maps to "?%80" in a windows-1252 encoded page.
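A minimal sketch of that rule in Python, for illustration only. The function name encode_query is hypothetical, the literal character U+20AC stands in for the "&euro;" reference, and Python's codec names (e.g. "cp1252" for windows-1252) are assumed to be close enough to the encodings browsers use:

```python
import codecs
from urllib.parse import quote


def encode_query(query: str, page_encoding: str) -> str:
    """Percent-encode a query string using the page encoding.

    Pages in utf-8 or utf-16 use utf-8; all other pages use their own
    encoding. Raises UnicodeEncodeError for unencodable code points.
    """
    # Normalize the label, e.g. "windows-1252" -> "cp1252".
    name = codecs.lookup(page_encoding).name
    encoding = "utf-8" if name.startswith(("utf-8", "utf-16")) else name
    # quote() accepts bytes and percent-encodes every byte that is not
    # unreserved; "=" and "&" are kept so key/value pairs stay readable.
    return quote(query.encode(encoding), safe="=&")


print(encode_query("\u20ac", "windows-1252"))  # %80
print(encode_query("\u20ac", "utf-16le"))      # %E2%82%AC
```

This sketch simply raises when a code point cannot be encoded; what should happen in that case is the open question.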

Currently browsers differ in what happens when the code point cannot be
encoded. E.g. "?&euro;"

Opera uses "?". Internet Explorer uses "?" (but when the URL hits the  
network layer, not when you inspect it via script). WebKit uses "&#...;".  
Gecko encodes it using utf-8.
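The three strategies can be approximated with Python's codec error handlers. This is a rough sketch, not a description of any browser's actual code. The function name is hypothetical, U+2603 is a code point picked for illustration because windows-1252 cannot represent it, and Python happens to emit a decimal reference where the above only says "&#...;". The bytes shown are the state before any percent-encoding:

```python
def unencodable_fallbacks(text: str, page_encoding: str) -> dict:
    """Return the bytes each fallback strategy would produce."""
    return {
        # Substitute "?" (as reported for Opera and Internet Explorer).
        "question_mark": text.encode(page_encoding, errors="replace"),
        # Substitute a numeric character reference (as reported for WebKit).
        "numeric_reference": text.encode(page_encoding, errors="xmlcharrefreplace"),
        # Ignore the page encoding and use utf-8 (as reported for Gecko).
        "utf8": text.encode("utf-8"),
    }


for strategy, raw in unencodable_fallbacks("\u2603", "cp1252").items():
    print(strategy, raw)
# question_mark b'?'
# numeric_reference b'&#9731;'
# utf8 b'\xe2\x98\x83'
```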

What Gecko does makes the resulting data impossible to interpret.
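One way to read that remark, sketched in Python under the assumption of a windows-1252 page and an arbitrarily chosen unencodable code point (U+2603): the utf-8 fallback bytes are also a valid windows-1252 byte sequence, so the receiving side cannot tell the two inputs apart.

```python
# Bytes produced by the utf-8 fallback for U+2603.
fallback_bytes = "\u2603".encode("utf-8")

# Bytes produced by encoding three ordinary windows-1252 characters.
lookalike = "\u00e2\u02dc\u0192"  # a-circumflex, small tilde, f with hook
page_encoded_bytes = lookalike.encode("cp1252")

# Both inputs yield the same bytes on the wire.
print(fallback_bytes == page_encoded_bytes)  # True
print(fallback_bytes.decode("cp1252") == lookalike)  # True
```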

What WebKit does is consistent with form submission. I like it.


Also, given that encoding behavior is not exposed besides form submission  
and URLs, consistently using "&#...;" for code points not represented in  
legacy encodings makes sense to me. Am I missing something?


-- 
Anne van Kesteren
http://annevankesteren.nl/
Received on Friday, 20 April 2012 02:15:16 GMT
