RE: Proxies and incorrect Content-Length from Joris Dobbelsteen on 2000-03-21 (www-talk@w3.org from March to April 2000)

From: Joris Dobbelsteen <j.p.tdobbelsteen@freeler.nl>
Date: Tue, 21 Mar 2000 18:39:11 +0100
To: <www-talk@w3.org>
Message-ID: <000001bf9369$957e3a30$0d0aa8c0@Thuis.local>
This means the proxy should not re-use an existing connection to an
upstream web server.
For HTTP/1.1 connections to the web server, send Connection: close; for
HTTP/1.0, send no connection headers at all. Then hope the server closes the
connection, or wait for a timeout and close it yourself.
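
A minimal sketch of that request setup, assuming the proxy builds the
upstream request head itself. The function name build_upstream_request and
its parameters are hypothetical, and a real proxy would also forward the
client's other headers:

```python
def build_upstream_request(method, path, host, http11=True):
    """Return the request head (bytes) for a single-use upstream connection."""
    if http11:
        # HTTP/1.1 connections are persistent by default, so ask the
        # server explicitly to close after this response.
        lines = [
            f"{method} {path} HTTP/1.1",
            f"Host: {host}",
            "Connection: close",
        ]
    else:
        # HTTP/1.0 closes by default as long as we send no
        # "Connection: keep-alive" header, so send nothing extra.
        lines = [f"{method} {path} HTTP/1.0"]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

With every upstream connection used only once, the end of the response
is marked by the server closing the socket rather than by Content-Length.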


It seems that Microsoft Internet Explorer (I have 5.01) doesn't care
about the Content-Length in the response. (I sent it a response from a proxy
with an incorrect Content-Length.)

Another possibility is to check for a string that looks like this:

HTTP/#*.#* ### *<CRLF>

	#      = single digit
	*      = any character sequence (including nothing)
	<CRLF> = CR + LF

However, this last approach costs the proxy server more processing time.
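
As a rough sketch, the pattern above could be written as a regular
expression. This is a slightly stricter reading than the wildcard form
(it treats the version as digits only), and the names STATUS_LINE and
find_status_line are hypothetical:

```python
import re

# "HTTP/" + digits "." digits + space + three-digit status + space
# + reason phrase (possibly empty) + CRLF.
STATUS_LINE = re.compile(rb"HTTP/\d+\.\d+ \d{3} [^\r\n]*\r\n")

def find_status_line(data):
    """Return the offset of the first status-line lookalike, or -1."""
    match = STATUS_LINE.search(data)
    return match.start() if match else -1
```

Note that a match only says the bytes look like the start of a response;
a body that happens to contain such a line would also match.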

You can also decline to handle incorrect Content-Length headers and return
an error, since the response is invalid as far as the proxy server is
concerned.



If you find a solution that seems to work for this problem, can you send it
to me? I'm also working on a proxy, and this problem was not known to me.

The only question is: why don't the CGI web servers where this bug is known
use chunked transfer? It is designed especially for dynamic web pages,
where the Content-Length is not known before the scripts (and the like)
start running.


	- Joris Dobbelsteen

-----Original Message-----
From: Miles Sabin [mailto:msabin@cromwellmedia.co.uk]
Sent: Monday, 20 March 2000 13:35
To: http-wg@hplb.hpl.hp.com
Subject: Proxies and incorrect Content-Length


I'm looking for a brief rundown on best-practice for how non-
caching, limited-buffering, proxies should handle origin server
responses with incorrect Content-Length headers.

As far as I can make out there are only two cases where a proxy
will be able to _reliably_ detect an incorrect Content-Length,

  HTTP1.1 origin server with Connection: close
  HTTP1.0 origin server with no Connection: keep-alive

in both cases a proxy can infer a Content-Length overrun
because it expects the connection to be closed at the end of
the response entity. Overruns with persistent connections can't
easily be distinguished from a broken subsequent response, and
underruns can't easily be distinguished from a broken
connection.
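
To illustrate the close-delimited case, here is a minimal sketch that
reads an upstream stream to EOF and compares the byte count with the
declared Content-Length. The function name check_length_on_close is
hypothetical, and the stream is assumed to carry only the response
entity (headers already consumed):

```python
import io

def check_length_on_close(stream, declared_length, bufsize=4096):
    """Read to EOF and classify the body as 'ok', 'overrun' or 'underrun'."""
    received = 0
    while True:
        chunk = stream.read(bufsize)
        if not chunk:  # EOF: the server closed the connection
            break
        received += len(chunk)
    if received > declared_length:
        return "overrun"
    if received < declared_length:
        # Could equally be a broken connection; the two look the same here.
        return "underrun"
    return "ok"

# Usage with an in-memory stand-in for the upstream socket:
print(check_length_on_close(io.BytesIO(b"x" * 12), declared_length=10))
```

The classification is only available once the connection has closed,
which is why it cannot be applied to persistent connections.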

Given that overruns are quite common (usually the result of
broken CGIs/SSIs not accounting for the length of non-static
data) I'd quite like to be able to forward such responses.
However I don't want to have to buffer the whole response to
recalculate the CL. I can see a couple of possibilities,

  HTTP1.1 downstream client
    Strip off the response Content-Length and forward with
    chunked transfer encoding.

  HTTP1.0 downstream client
    Strip off the response Content-Length and close the
    connection after the response entity.
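
A minimal sketch of these two re-framing options, assuming headers are
held as a list of (name, value) pairs and the entity arrives as an
iterable of byte blocks. The names reframe_headers and to_chunked are
hypothetical:

```python
def reframe_headers(headers, client_http11):
    """Drop Content-Length and pick a framing the client can rely on."""
    kept = [(name, value) for name, value in headers
            if name.lower() not in ("content-length", "connection")]
    if client_http11:
        kept.append(("Transfer-Encoding", "chunked"))
    else:
        # HTTP/1.0 client: the end of the entity is signalled by close.
        kept.append(("Connection", "close"))
    return kept

def to_chunked(blocks):
    """Yield each block in chunked transfer coding, then the last chunk."""
    for block in blocks:
        if block:  # a zero-length chunk would end the body early
            yield b"%x\r\n" % len(block) + block + b"\r\n"
    yield b"0\r\n\r\n"

# Usage: forward blocks as they arrive, without buffering the response.
print(b"".join(to_chunked([b"hello", b" world!"])))
```

Because each block is framed on its own, the proxy never needs to know
the total length in advance.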

Unfortunately the second of these effectively precludes the
use of Keep-Alive on all HTTP1.0 responses: because the proxy
won't be able to determine whether or not there's been an
overrun until the origin-server has run over the end, so *all*
responses have to be presumed to be potential overrunners.

Other options avoid that problem, but look troublesome,

  Truncate the response entity
    Dangerous for non text/* types; problematic even for those
    (eg. stripped trailing copyright messages).

  Forward any content overrun, then close the connection.
    Problematic for HTTP1.0 Keep-Alive clients which might
    attempt to interpret the overrun as the headers of a
    subsequent response; technically illegal for an HTTP1.1
    proxy. OTOH, the proxy would be forwarding stuff which is
    no more broken than would have been received had the origin
    server been contacted directly.

Opinions?

Cheers,


Miles

--
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)20 8817 4030               London, W6 0LJ, England
msabin@cromwellmedia.com          http://www.cromwellmedia.com/
Received on Tuesday, 21 March 2000 14:13:16 UTC