- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Fri, 17 Jan 2014 12:02:56 +0100
- To: Nicolas Mailhot <nicolas.mailhot@laposte.net>
- CC: Gabriel Montenegro <gabriel.montenegro@microsoft.com>, Zhong Yu <zhong.j.yu@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>, Osama Mazahir <osamam@microsoft.com>, Dave Thaler <dthaler@microsoft.com>, Mike Bishop <michael.bishop@microsoft.com>, Matthew Cox <macox@microsoft.com>
On 2014-01-17 11:55, Nicolas Mailhot wrote:
>
> On Fri, 17 January 2014 11:28, Julian Reschke wrote:
>> On 2014-01-17 11:18, Nicolas Mailhot wrote:
>>>
>>> On Thu, 16 January 2014 22:32, Julian Reschke wrote:
>>>
>>>> A proxy does not need to normalize. Full stop. There is no issue here,
>>>
>>> A security proxy does need to normalize. Full stop. Otherwise malware
>>> can trivially bypass security blocks by fuzzing the encoding enough
>>> that the proxy no longer realizes the block needs to be applied.
>>
>> Are you talking about normalization beyond removing unneeded
>> percent-escapes?
>
> I'm talking about the very common case where a botnet or malware strain's
> signature is a URL fragment it tries to communicate with on random zombie
> hosts on the Internet. It is very common to configure proxy gateways to
> block any access to a URL that includes this fragment as a first-level
> defence while more accurate and complete cleanup measures are
> investigated.
>
> (Malware is the worst case; sometimes it's just misbehaving browser
> plugins or other web clients that need blocking to keep the network
> operational.)
>
> Obviously that only works if the gateway can recognize the URL fragment
> without being confused by encoding games. So the gateway does need a
> reliable way to map byte chains to the text signature (and there is a text
> signature, because the app writer did use text strings and not random
> constants in his code). Unspecified text encoding conventions in URLs make
> reliability go away.
>
> Again, I would like HTTP/2 to specify that URLs are transported as UTF-8
> text in HTTP/2 metadata (ideally not %-escaped), with the endpoints being
> responsible for converting their local representation to this form before
> emission, or, barring that:
> 1. add encoding info somewhere
> 2. require the web client and server to fill in this info.
>
> But I really would prefer that the wire representation be unambiguous and
> encoding conversions be pushed to the endpoints. That's the model the
> Python people settled on after years of failing to make the "push
> everything as a chain of bytes, whatever needs text will manage to convert
> by itself" approach work. And HTTP nodes are way less flexible than a
> Python program.
>
> Regards,

Nicolas,

please provide a concrete example.

Best regards, Julian
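
[Illustration, not part of the original message.] One way to picture the concern raised in the quoted text is the minimal Python sketch below. It is only a sketch under stated assumptions: the signature `/bot/café`, the host `example.test`, and the helpers `naive_block` and `normalizing_block` are all hypothetical and do not come from the thread or from any real proxy. It shows that the same text signature yields different percent-escaped bytes depending on the sender's encoding, so a gateway that matches on the raw string misses it, while one that decodes has to guess which encodings to try.

```python
from urllib.parse import quote, unquote_to_bytes

# Hypothetical blocklist signature (an assumption for illustration only).
SIGNATURE = "/bot/café"


def naive_block(url: str) -> bool:
    """Match the signature against the URL exactly as received."""
    return SIGNATURE in url


def normalizing_block(url: str, encodings=("utf-8", "latin-1")) -> bool:
    """Undo percent-escapes, then try each candidate encoding in turn.

    The list of encodings is a guess: nothing in the URL itself says
    which one the sender used, which is the ambiguity under discussion.
    """
    raw = unquote_to_bytes(url)
    for enc in encodings:
        try:
            if SIGNATURE in raw.decode(enc):
                return True
        except UnicodeDecodeError:
            continue  # bytes are not valid in this encoding; try the next
    return False


# The same text path, serialized three different ways.
utf8_url = "http://example.test" + quote(SIGNATURE, encoding="utf-8")
latin1_url = "http://example.test" + quote(SIGNATURE, encoding="latin-1")
over_escaped_url = "http://example.test/%62ot/caf%C3%A9"
clean_url = "http://example.test/index.html"

for candidate in (utf8_url, latin1_url, over_escaped_url, clean_url):
    print(candidate, naive_block(candidate), normalizing_block(candidate))
```

In this sketch the naive matcher misses all three escaped forms, and the normalizing matcher catches them only because both candidate encodings happen to be in its list. A sender using some other legacy encoding would still slip through, which is the argument for fixing the wire encoding rather than guessing at the gateway.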
Received on Friday, 17 January 2014 11:03:59 UTC