Re: HTTP/1.1 : Chunking

Tena koe

At 09:39 30/01/98 -0800, Roy T. Fielding wrote:
>>As for backward compatibility, that is pretty subjective.  In software
>>development, it is more easy (read more reliable) to implement a simple
>>protocol than one which is backwardly compatible and complex.
>
>In reality, it is easier and more reliable to deploy a protocol that
>is backwards compatible.  HTTP/1.0 was almost as complex as HTTP/1.1 --
>the only big difference is that HTTP/1.0 was unable to accomplish what
>it tried to do, whereas HTTP/1.1 is barely sufficient without making
>incompatible changes.

When I look at the coding I will have to do to implement HTTP/1.1 support
in a caching proxy which also does tunneling, gatewaying and serving,
given that I need to support HTTP/1.0 and 0.9 as well, there are going to be
so many switches in the code that it will be as if I had implemented another
protocol anyway.  I admit my interest in the points I have raised is mainly
self-interest, but it would be interesting to hear the reasoning behind
something like multiple possible entities for a single resource (rather than,
say, a redirection mechanism, which more people will probably deploy).  Files
in a file system are unique.  Machines on the internet are unique.  Time is
unique.  What is wrong with the concept of a unique addressing scheme?

I do realise that HTTP/1.1 is intended for a raft of different applications,
but there is also a raft of protocols being developed for these applications.
I also understand that a large part of the complexity of 1.1 is to do
with interactions between caches; unfortunately, that is what I need to
implement.  Also, a great number of users obtain all their Internet
access through a proxy server, and this number is growing rapidly.  So these
more complex systems are becoming ever more prevalent.  That means they have
a greater impact on more people, which means software vendors had better get
it right.  It is easier (more likely) to get it right if it is simpler.
Allowing 3 date formats in HTTP/1.1 seems to fly in the face of
this.  Sure, it is necessary for HTTP/1.0, but can't we mandate that clients
use a single fixed format if they are to be HTTP/1.1 compliant?  That way, in
the future, the other two can be removed and code can be simplified.
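
To make the date-format point concrete: the three layouts an HTTP/1.1
recipient has to accept are the RFC 1123 form, the older RFC 850 form and
the C asctime() form.  The following is only a minimal sketch, assuming the
default C locale for day and month names; the function names are made up
for illustration.  It accepts all three layouts but only ever emits the
fixed RFC 1123 one.

```python
from datetime import datetime

# The three date layouts an HTTP/1.1 recipient has to accept.
_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",  # RFC 1123: Sun, 06 Nov 1994 08:49:37 GMT
    "%A, %d-%b-%y %H:%M:%S GMT",  # RFC 850:  Sunday, 06-Nov-94 08:49:37 GMT
    "%a %b %d %H:%M:%S %Y",       # asctime:  Sun Nov  6 08:49:37 1994
)


def parse_http_date(text):
    """Try each accepted layout in turn; return a naive datetime in GMT."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    raise ValueError("not a recognised HTTP date: %r" % text)


def format_http_date(when):
    """Always emit the single fixed (RFC 1123) layout."""
    return when.strftime("%a, %d %b %Y %H:%M:%S GMT")
```

Note that the RFC 850 layout carries only a two-digit year, so the parser
has to guess the century; that ambiguity is one more argument for a single
fixed format.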

>
>Keep in mind that HTTP is intended for many more applications than
>just the one that you are working on today.  Even with all of its
>apparent complexity, it is still possible to write a simple HTTP server
>in just a few hours, and a simple HTTP client in a few days.  The
>complexity is only needed by complex applications, such as caching,
>but failure to account for that complexity results in failed systems.
>

I also see complexity in things like:

Strong vs Weak comparisons.  I wonder what human will ever decide that they
don't mind if people don't get the latest version of their work because it
didn't change appreciably, or what system administrator is going to flag a
file as having hardly changed, or what server developer is going to
support this (no mean task).
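
For reference, the two comparison functions themselves are small; the cost
is in deciding when a server may hand out a weak validator.  A rough sketch,
assuming entity tags arrive as strings and that a weak tag is marked with a
leading W/ (the helper names are hypothetical):

```python
def _split(tag):
    """Return (is_weak, opaque_value) for an entity tag such as W/"v1"."""
    if tag.startswith("W/"):
        return True, tag[2:]
    return False, tag


def weak_match(a, b):
    """Weak comparison: opaque values equal; the weak flags are ignored."""
    return _split(a)[1] == _split(b)[1]


def strong_match(a, b):
    """Strong comparison: opaque values equal and neither tag is weak."""
    weak_a, value_a = _split(a)
    weak_b, value_b = _split(b)
    return not weak_a and not weak_b and value_a == value_b
```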

Cache Staleness, and warnings.  I wonder what software vendor is not going
to curse every time a user calls up or emails complaining about warning
messages, and wondering what to do about them.  That costs real money.  I
realise that in terms of coping with connectivity problems and saving
bandwidth it can be useful, but what user is not going to force an update
anyway?  As for saving bandwidth, the protocol overhead in HTTP is huge, and
HTTP/1.1 adds even more.  You get about 2k of headers to transfer a 256 byte
file.  Analysis of our caches and customer caches shows that the majority of
files cached are under 4k.  That means header bandwidth accounts for about
50%.  If you want to save bandwidth, look there.
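
The arithmetic behind that estimate can be checked with a few lines.  This
is a sketch only: it assumes "2k" means 2048 bytes of request plus response
headers per transfer, and the body sizes are illustrative, not measurements
from the caches mentioned above.

```python
def header_overhead(header_bytes, body_bytes):
    """Fraction of the bytes on the wire that are headers, not content."""
    return header_bytes / (header_bytes + body_bytes)


# Assumed header size of 2048 bytes against a few illustrative body sizes.
for body in (256, 2048, 4096):
    share = header_overhead(2048, body)
    print("body %5d bytes: headers are %.0f%% of the transfer"
          % (body, share * 100))
```

With a 2048 byte body the headers are exactly half of the transfer; for
smaller bodies the share is higher, and for a 4096 byte body it is a third.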

Anyway, sorry for my ranting.  I don't really expect that this will
change anything.  However, I do have a couple of ideas that could be useful
and could be added easily.

1. Interceptive HTTP caching.

The way this works in WinGate is this.  Clients may be using SOCKS to
communicate directly with a server through a SOCKS firewall.  When our SOCKS
server receives a connection request for port 80, 8080 etc, it looks
at the first packet of the client data.  If it sees HTTP, it intercepts the
request and passes it to the caching web proxy that is also built into
WinGate.  The important thing to note here is that the client thinks it is
talking to the end server, and does not know it is talking through a proxy.
This flies against certain requirements in the HTTP/1.1 spec.  The reason we
did it, though, is obvious enough: sharing a single cache between SOCKS
clients.  I also see this appearing in router products in the future.
Imagine a router that could intercept and cache HTTP; what a bandwidth saving
you would have there.  It would be really easy to do, too.
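
The "looks at the first packet" step could be sketched roughly as below.
This is not WinGate's code; the function name and the method list are
assumptions for illustration, and it presumes the whole request line
arrives in the first packet.

```python
# Request methods to recognise; an assumed list, not an exhaustive one.
HTTP_METHODS = (b"GET", b"HEAD", b"POST", b"PUT", b"DELETE",
                b"OPTIONS", b"TRACE")


def looks_like_http(first_packet):
    """Guess whether the first client bytes start an HTTP request line."""
    request_line = first_packet.split(b"\r\n", 1)[0]
    parts = request_line.split(b" ")
    if len(parts) < 2 or parts[0] not in HTTP_METHODS:
        return False
    # HTTP/0.9 sends just "GET <uri>"; later versions add "HTTP/x.y".
    return len(parts) == 2 or parts[2].startswith(b"HTTP/")
```

Anything that fails the check would simply be relayed as ordinary SOCKS
traffic, so a wrong guess in that direction costs only a missed cache hit.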

The addition of the Host: header in HTTP/1.1 is really good, because it
allows the proxy to build up a full URL for cache indexing, whereas before
it perhaps only had the IP address of the host, which causes caching
efficiency problems for big sites that serve a single domain name from
many machines.
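
A sketch of how an intercepting proxy might build that cache index follows.
The function and parameter names are hypothetical, and header names are
assumed to have been lower-cased by whatever parsed the request.

```python
def cache_key(request_target, headers, peer_ip, port=80):
    """Build the URL under which an intercepted request is cached."""
    if request_target.startswith("http://"):
        # Proxy-style request: the full URL is already present.
        return request_target
    host = headers.get("host")
    # Prefer the Host: value; fall back to the destination IP address.
    authority = host.strip().lower() if host else peer_ip
    if ":" not in authority and port != 80:
        authority = "%s:%d" % (authority, port)
    return "http://%s%s" % (authority, request_target)
```

With Host: present, requests that land on different machines behind one
domain name share a single key; without it, each IP address gets its own
copy of the same object.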

A response header or something similar that could tell the client their
request had been intercepted would be useful here.

2. Condensing real-time streams.

We are seeing a couple of clients that use HTTP and TCP flow control to get
real-time audio.  The data rate is controlled by how fast the client chooses
to read the data off the TCP buffers.  It would be useful for a proxy to be
able to recognise whether such data is real-time or not, perhaps by means of
a major content-type.  Then if anyone else requested the same resource, they
could be branched in to the existing stream.  This could provide enormous
bandwidth savings in large hierarchical caching proxy architectures.
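
The branching-in idea amounts to a fan-out of one upstream connection to
many downstream readers.  A minimal in-memory sketch, with a hypothetical
class name and no real sockets involved:

```python
from collections import deque


class StreamFanout:
    """One upstream real-time stream shared by many downstream clients."""

    def __init__(self):
        self.listeners = {}  # client id -> queue of chunks not yet sent

    def join(self, client_id):
        # A late joiner starts at the current position; no history replay.
        self.listeners[client_id] = deque()

    def leave(self, client_id):
        self.listeners.pop(client_id, None)

    def publish(self, chunk):
        # Called once for each chunk read from the single upstream socket.
        for queue in self.listeners.values():
            queue.append(chunk)

    def read(self, client_id):
        """Return the next pending chunk for a client, or None if idle."""
        queue = self.listeners[client_id]
        return queue.popleft() if queue else None
```

The unbounded queues are a simplification; since the data is real-time, a
real proxy would presumably drop old chunks for a slow reader rather than
let them pile up.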


Anyway, a couple of thoughts.

Cheers

Adrien


>Kia ora,
>
>....Roy
>
----------------------------------------------------------------------------
Adrien de Croy - adrien@qbik.com.  Qbik New Zealand Limited, Auckland, New Zealand
See our pages and learn about WinGate at http://www.qbik.com/
----------------------------------------------------------------------------

Received on Friday, 30 January 1998 16:02:10 UTC