Re: protocol support for intercepting proxies

From: Adrien de Croy <adrien@qbik.com>
Date: Tue, 19 Jun 2007 12:48:00 +1200
Message-ID: <467727C0.30800@qbik.com>
To: Henrik Nordstrom <henrik@henriknordstrom.net>
CC: HTTP Working Group <ietf-http-wg@w3.org>



Henrik Nordstrom wrote:
> Mon 2007-06-18 at 18:24 +1200, Adrien de Croy wrote:
>   
> I hereby call you to look into the problem of why interception is
> needed, and to try to come up with ideas to address that problem. In the
> long run a proper solution to that problem will benefit you and everyone
> else.
>
>   
I can start with why we implemented an intercepting proxy.  Apart from 
commercial / competitive issues that is.

Basically the main thing we identified was that customers wanted it.  
Feedback was pretty explicit on this.

Why they wanted it, I guess, is that proxy auto-detect or ordinary proxy 
operation wasn't doing it for them. 

Proxy auto-detect
------------------
There are plenty of our customers that use proxy auto-detect and are 
quite happy with it.

However, proxy auto-detect can require some configuration.  Originally 
proxy auto-detect worked by doing a DNS lookup for the name "WPAD".  
WinGate's DNS server would respond to this with its own IP address.  The 
client would then make an HTTP request for a file called wpad.dat.  We 
would auto-generate this and send it to the client.  So, basically 
customers got proxy auto-detect for free.  I presume other vendors 
adopted a similar approach.
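
As a rough illustration of the DNS-based discovery just described (a 
sketch only, not WinGate's or any browser's actual code): the function 
name and the injectable `resolve` parameter are assumptions made for 
the example, and the file is assumed to be served over plain HTTP from 
the host named "wpad".

```python
import socket
from typing import Callable, Optional


def discover_wpad_url(resolve: Callable[[str], str] = socket.gethostbyname,
                      name: str = "wpad") -> Optional[str]:
    """Return the auto-config URL if the name "wpad" resolves, else None.

    `resolve` is injectable (hypothetical parameter) so the lookup can be
    exercised without touching the real network.
    """
    try:
        resolve(name)  # any answer means something on the network claims "wpad"
    except OSError:    # socket.gaierror is a subclass of OSError
        return None
    # The client would then issue an ordinary HTTP GET for this URL.
    return f"http://{name}/wpad.dat"
```

Note that nothing here checks who answered the lookup; the client 
simply trusts the DNS response.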

Then someone at MS decided DHCP would be a better option for discovering 
proxy config, and option 252 was created, whereby the DHCP client 
requests option 252 as part of a DHCPINFORM message.  DHCPINFORM was a 
late addition to the DHCP spec, and option 252 is still listed by the 
IANA as "private".  So there are a bunch of DHCP servers out there that 
support neither DHCPINFORM nor option 252.
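
To make the DHCP exchange concrete, here is a minimal sketch of just 
the options field involved: the client's INFORM asks for option 252 via 
the parameter request list, and the reply (if the server cooperates) 
carries the auto-config URL in option 252.  The function names are 
assumptions for the example, and the rest of the DHCP packet (header, 
magic cookie, transport) is deliberately left out.

```python
from typing import Optional

PAD = 0
DHCP_MESSAGE_TYPE = 53       # option carrying the DHCP message type
PARAMETER_REQUEST_LIST = 55  # option listing the options the client wants
WPAD_OPTION = 252            # private-use option carrying the config URL
END = 255
DHCPINFORM = 8               # message type value for INFORM


def build_inform_options() -> bytes:
    """Options for an INFORM that asks the server for option 252."""
    return bytes([DHCP_MESSAGE_TYPE, 1, DHCPINFORM,
                  PARAMETER_REQUEST_LIST, 1, WPAD_OPTION,
                  END])


def extract_wpad_url(options: bytes) -> Optional[str]:
    """Walk code/length/value triples and return option 252, if present."""
    i = 0
    while i < len(options):
        code = options[i]
        if code == END:
            break
        if code == PAD:              # PAD has no length byte
            i += 1
            continue
        if i + 1 >= len(options):    # truncated option: give up
            break
        length = options[i + 1]
        value = options[i + 2:i + 2 + length]
        if code == WPAD_OPTION:
            # Some servers NUL-terminate the string; strip that if present.
            return value.rstrip(b"\x00").decode("ascii", errors="replace")
        i += 2 + length
    return None
```

A server that does not know option 252 simply omits it, in which case 
`extract_wpad_url` returns None and the client has nothing to go on.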

So, we have some browsers that use DNS (which they can control, but 
which seems to be being deprecated), and others that rely on their OS 
platform using a private DHCP option that is not formally specified.  
Some client platforms don't provide a mechanism by which applications 
(such as a browser) can obtain info from DHCP clients or servers (i.e. 
influence DHCP requests and access response data).  The most notable 
example that springs to mind is most, if not all, versions of Windows.

Finally, enabling or disabling proxy auto-detection is under user 
control.  Heaven forbid!

I'm not aware of any browsers that do a sanity check on the values that 
get returned from a WPAD DNS lookup, or the URL from DHCP option 252 
either.  We just presume that the DNS and DHCP server responses are 
trustworthy.
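
For what it's worth, the kind of sanity check I have in mind could be 
as small as the sketch below.  This is purely hypothetical: the 
function names and the specific rules (HTTP(S) only, no embedded 
credentials, server address inside a network the client already 
trusts) are assumptions for illustration, not something any browser is 
known to do.

```python
import ipaddress
from urllib.parse import urlsplit


def plausible_wpad_url(url: str) -> bool:
    """Reject auto-config URLs that are obviously not a web-served file."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False                      # e.g. file: or ftp: URLs
    if not parts.hostname:
        return False                      # no server named at all
    if parts.username or parts.password:
        return False                      # credentials embedded in the URL
    return True


def on_local_network(address: str, network: str) -> bool:
    """True if the resolved server address lies inside the given network."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(network)
```

For example, `plausible_wpad_url("http://wpad/wpad.dat")` passes while 
`plausible_wpad_url("file:///etc/passwd")` does not.  Neither check 
proves the answer is honest; they only filter out the crudest cases.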

So in short, IMHO proxy auto-detect is fraught with problems.

Proxy operation
----------------
There are zillions of applications out there that use HTTP for whatever 
purpose.  Many of these are cobbled together.  Many do not consider the 
existence of proxies at all, or can't do proxy auth, or any sort of HTTP 
auth.  Many do not present any UI, but operate as background service 
processes.  Thankfully with APIs like MS INETAPI on Windows, developers 
now have an easier way to write such things, and they gain support for 
things like chunking, auth etc.  But there are still a bunch of broken 
apps out there.

Couple this with the sys admin that still wants to force everything 
through the proxy anyway, and your only option is connection 
interception, where the proxy pretends to behave like a server (right 
down to the header level).
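
To illustrate what "behaving like a server" means at the header level: 
an intercepted client sends the request as it would to an origin 
server (a path in the request line plus a Host header), so the proxy 
has to rebuild the full URL itself.  A minimal sketch, assuming plain 
HTTP and a client that sends Host; the function name is made up for 
the example, and a real interceptor would also need a fallback to the 
connection's original destination address for clients that omit Host.

```python
def target_url(raw_request: bytes) -> str:
    """Rebuild the absolute URL from an intercepted HTTP request head."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)

    # A proxy-aware client already sends the absolute URL.
    if target.lower().startswith(("http://", "https://")):
        return target

    # An intercepted client sends only a path; the host is in the headers.
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return f"http://{value.strip()}{target}"

    raise ValueError("no Host header: cannot tell where the client was going")
```

So a request of `GET /index.html HTTP/1.1` with `Host: example.com` is 
treated as a request for `http://example.com/index.html`.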

Many of these apps are very old, and aren't being maintained, but are 
still in use and relied on.

This isn't actually an argument for changes to the protocol, since the 
proposed changes would break all these apps.

> Trying to solve this in HTTP is short-sighted and doomed for failure
>   
Actually I see the configuration of HTTP proxy as being an HTTP protocol 
configuration element, and therefore I don't see why it shouldn't be 
solved in HTTP.  It's not like we're setting a parameter unrelated to 
HTTP.  Pushing the config out to other services multiplies the number of 
applications that need to be maintained.  A sys admin doesn't need to 
keep just their proxy and clients up to date, but also their DHCP 
servers and other servers.  If these are all built into a hardware box, 
there aren't many options.

You might laugh, but one of our biggest support issues used to be people 
configuring their proxy config to use port 21 for FTP proxy in IE.  They 
"knew" that port 21 was the port to be used for FTP.  In the end, we had 
to write an HTTP handler in WinGate's FTP proxy to tell them to change 
their config when it detected an HTTP request for an FTP URL.

There are lots of people out there with a little knowledge, enough to 
make it dangerous to get them to do any sort of manual configuration.

If we could assume that all HTTP agents could work through a proxy, and 
do proxy auth, we should be looking at a more foolproof mechanism for 
proxy auto configuration.  That could take many forms.  You don't like 
the one proposed so far, and that I can live with.  But there 
are potentially others that would solve concerns, and achieve the goals 
of minimum impact on other systems.  The fewer the links in the chain 
IMO the better.  So firstly I think such a mechanism should be

1. under network, not individual, control (so the network can set the 
requirement to use the proxy, not the client, and it can be managed 
centrally).
2. not reliant on DHCP.  Why must we force everyone on the planet to use 
DHCP just to get an auto proxy config?  Why must we force all OS vendors 
to open up their DHCP clients so that applications can obtain DHCP 
configuration data?

So maybe DNS SRV records are the way to go here.  Some customers lock 
down even DNS though, since the proxy does the DNS lookup for normal 
proxy operations.  Also, most OS-provided DNS resolver implementations 
(e.g. Winsock) only support A record lookups, so unless you want browser 
authors to write their own DNS resolvers (and deal with the 
configuration issues around that), they won't have much luck on many 
OSes even getting a SRV record.

The one common factor that all browsers have is HTTP support.

In the end, the best solutions may be platform-specific, i.e. Active 
Directory policy based, or extensions to network logons.  But I'm 
struggling :)


Adrien


> imho.
>
> Regards
> Henrik
>   

-- 
Adrien de Croy - WinGate Proxy Server - http://www.wingate.com
Received on Tuesday, 19 June 2007 00:47:49 GMT