- From: Anne van Kesteren <annevk@opera.com>
- Date: Fri, 20 Apr 2012 11:15:16 +0200
The URL query component for URLs found in HTML (exact set still to be defined, I think) uses the page encoding, unless the page encoding is utf-8 or utf-16, in which case it uses utf-8. E.g. "?€" maps to "?%80" in a windows-1252 encoded page.

Currently browsers differ in what happens when the code point cannot be encoded. E.g. "?€" Opera uses "?". Internet Explorer uses "?" (but only when the URL hits the network layer, not when you inspect it via script). WebKit uses "&#...;". Gecko encodes it using utf-8.

What Gecko does makes the resulting data impossible to interpret. What WebKit does is consistent with form submission. I like it. Also, given that encoding behavior is not exposed besides form submission and URLs, consistently using "&#...;" for code points not represented in legacy encodings makes sense to me.

Am I missing something?

-- 
Anne van Kesteren
http://annevankesteren.nl/
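
For illustration only, here is a rough Python sketch of the fallback strategies described above. The function name `encode_query`, the fallback labels, the set of characters left unescaped, and the sample character U+2603 are all assumptions made for this example; it is not a description of how any browser actually implements this.

```python
from urllib.parse import quote

def encode_query(query: str, page_encoding: str, fallback: str) -> str:
    """Percent-encode `query` via a legacy page encoding (hypothetical helper).

    fallback selects what to do with a code point the encoding cannot
    represent: "question" (a literal "?"), "ncr" (a decimal "&#...;"
    reference, as in form submission) or "utf-8" (the UTF-8 bytes).
    """
    out = []
    for ch in query:
        try:
            raw = ch.encode(page_encoding)
        except UnicodeEncodeError:
            if fallback == "question":
                raw = b"?"
            elif fallback == "ncr":
                raw = f"&#{ord(ch)};".encode("ascii")
            elif fallback == "utf-8":
                raw = ch.encode("utf-8")
            else:
                raise ValueError(f"unknown fallback: {fallback!r}")
        # Assumption: "?", "&", "#", ";" and "=" are left unescaped so the
        # substituted text stays visible in the result.
        out.append(quote(raw, safe="?&#;="))
    return "".join(out)

# "€" is representable in windows-1252 (byte 0x80), so no fallback is needed.
print(encode_query("?€", "windows-1252", "ncr"))       # ?%80
# U+2603 is not representable in windows-1252, so the fallback applies.
print(encode_query("?\u2603", "windows-1252", "question"))  # ??
print(encode_query("?\u2603", "windows-1252", "ncr"))       # ?&#9731;
print(encode_query("?\u2603", "windows-1252", "utf-8"))     # ?%E2%98%83
```

With the "utf-8" fallback the output mixes windows-1252 bytes and UTF-8 bytes in one query string with nothing to mark which is which, whereas the "ncr" output can still be decoded back to the original code point.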
Received on Friday, 20 April 2012 02:15:16 UTC