- From: Nicolas Mailhot <nicolas.mailhot@laposte.net>
- Date: Fri, 21 Mar 2014 14:05:21 +0100
- To: "Julian Reschke" <julian.reschke@gmx.de>
- Cc: "Nicolas Mailhot" <nicolas.mailhot@laposte.net>, "Mark Nottingham" <mnot@mnot.net>, "HTTP Working Group" <ietf-http-wg@w3.org>, "Gabriel Montenegro" <gabriel.montenegro@microsoft.com>
On Fri, 21 Mar 2014 12:34, Julian Reschke wrote:
> On 2014-03-21 12:29, Nicolas Mailhot wrote:
>>
>> On Fri, 21 Mar 2014 12:01, Julian Reschke wrote:
>>> On 2014-03-21 11:55, Nicolas Mailhot wrote:
>>
>>> That seems to be the same use case as #1.
>>>
>>> Why don't you just try to UTF-8 decode, and if that works, assume that
>>> it indeed is UTF-8?
>>
>> Really, can't you read the abundant documentation that was written on
>> the massive FAIL duck typing is for encoding (for example, Python-side)?
>> Code passing unit tests, then failing right and left as soon as some new
>> encoding combo, or text triggering encoding differences, was injected
>> into the system? Piles upon piles of partial workarounds, until there
>> was complete loss of understanding of how they were all supposed to work
>> in the first place?
>
> I understand the problems caused by not knowing what encoding something
> is in. What I don't understand is how an out-of-band signal helps if you
> really can't rely on it being accurate.
>
> Practically, how is a UA supposed to *know* the encoding that was used
> for the URI *unless* it constructed it itself? (Which is not what
> browsers do; they only construct the query part.)

If the browser constructed the URL, it knows damn well what the encoding of its address bar is and how to convert it to UTF-8.

If the browser got the URL from a web page or feed or whatever, all those documents are supposed to declare an encoding so they can be interpreted at all (and there is a default encoding in the spec if they don't), so it can use that encoding and convert to UTF-8 before sending.

If the encoding declared in the document, or in the HTTP headers the web site set, is wrong, things will fail, but no more than if the web page author had made a typo in the link. And I want them to fail, not propagate errors to innocent bystanders.
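As a minimal sketch of the duck-typing failure mode under discussion (my illustration, not from the thread): some perfectly ordinary Latin-1 byte sequences also happen to be valid UTF-8, so "try to decode as UTF-8 and see if it works" can silently pick the wrong interpretation rather than report an error.

```python
# A Latin-1 string of two characters: 'Ã' (0xC3) followed by '©' (0xA9).
latin1_text = "\u00c3\u00a9"
raw = latin1_text.encode("latin-1")   # b'\xc3\xa9'

# The duck-typing check: does it decode as UTF-8? It does -- but the
# byte pair 0xC3 0xA9 is also the valid UTF-8 encoding of the single
# character 'é' (U+00E9), so the guess is wrong, with no error raised.
guessed = raw.decode("utf-8")

print(guessed)                 # 'é' -- one character, not the original two
print(guessed == latin1_text)  # False: silent misinterpretation
```

The check only catches byte sequences that are *invalid* UTF-8; any legacy-encoded text that happens to form valid UTF-8 sequences passes undetected, which is exactly the kind of hole an explicit encoding declaration is meant to close.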
The whole concept of silently fixing problems with heuristics, until web site authors assume they can write garbage and it will be autocorrected at the cost of security and reliability, cannot work at a large scale. There are too many people willing to exploit the holes those autocorrection heuristics open right and left.

People making mistakes is not an excuse for writing fuzzy specs to avoid assigning responsibility, and then expecting things to work out anyway. That's PHB thinking.

Anyone in 2014 who defines a URL container and thinks they can avoid specifying the encoding of that container is in for a world of grief, and that won't change whether the http2 spec explicitly fixes this hole or not. And I'd rather have http2 implementors avoid this particular pitfall because the spec is clear on the subject.

-- 
Nicolas Mailhot
Received on Friday, 21 March 2014 13:06:09 UTC