- From: Nicolas Mailhot <nicolas.mailhot@laposte.net>
- Date: Fri, 21 Mar 2014 14:47:30 +0100
- To: "Julian Reschke" <julian.reschke@gmx.de>
- Cc: "Nicolas Mailhot" <nicolas.mailhot@laposte.net>, "Mark Nottingham" <mnot@mnot.net>, "HTTP Working Group" <ietf-http-wg@w3.org>, "Gabriel Montenegro" <gabriel.montenegro@microsoft.com>
On Fri, 21 March 2014 14:18, Julian Reschke wrote:
> On 2014-03-21 14:05, Nicolas Mailhot wrote:
>> ...
>>> Practically, how is a UA supposed to *know* the encoding that was used
>>> for the URI *unless* it constructed it itself? (Which is not what
>>> browsers do; they only construct the query part).
>>
>> If the browser constructed the URL it knows damn well what is the
>> encoding of its address bar and how to convert to UTF-8
>
> OK. But that is true only if the URI was constructed by parsing the
> address bar. It's not the case when following links in documents (when
> they are already percent-escaped).
>
>> If the browser got the URL in a web page or feed or whatever all those
>> documents are supposed to declare an encoding so they can be
>> interpreted at all (and there is a default encoding in the spec if
>> they don't) so it can use that encoding and convert to UTF-8 before
>> sending
>
> That only helps when the link wasn't percent-escaped in the first place.

I'll let you in on a big secret: nobody writes percent-escapes by hand if
they can avoid it, just as nobody uses HTML entities. The bulk of
percent-escaped URLs have been produced by automatons converting
human-written plain text that used the document's main encoding, so yes, I
do expect both encodings to match if the automaton was coded properly.

Regards,

-- 
Nicolas Mailhot
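For illustration only, the conversion argued for above could look roughly like the sketch below. It assumes the percent-escapes in a link were produced from the document's declared encoding, which is exactly the point under dispute, and it is not taken from any browser or from the HTTP drafts. The function name `reencode_to_utf8` and the sample URLs are hypothetical.

```python
import re
from urllib.parse import quote, unquote_to_bytes

# One or more consecutive percent-escapes, e.g. "%E9" or "%C3%A9".
_ESCAPED_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def reencode_to_utf8(escaped_url: str, document_encoding: str) -> str:
    """Re-escape non-ASCII percent-escapes as UTF-8.

    Assumption: the escapes encode text in `document_encoding`, and
    multi-byte sequences are fully escaped (no literal trail bytes).
    """

    def convert(match: re.Match) -> str:
        raw = unquote_to_bytes(match.group(0))
        if all(byte < 0x80 for byte in raw):
            # Pure ASCII escapes (such as %2F) keep their meaning; leave them.
            return match.group(0)
        text = raw.decode(document_encoding)
        # safe="" so that reserved characters inside the run stay escaped.
        return quote(text, safe="")

    return _ESCAPED_RUN.sub(convert, escaped_url)


if __name__ == "__main__":
    print(reencode_to_utf8("/caf%E9?q=%E9t%E9", "iso-8859-1"))
```

If the escapes were in fact produced from some other encoding, the result is silently wrong (or decoding raises `UnicodeDecodeError`), which is the failure mode raised in the quoted reply.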
Received on Friday, 21 March 2014 13:48:16 UTC