Re: Gopher+ Considered Harmful

Tim Berners-Lee (timbl@www3.cern.ch)
Fri, 11 Dec 92 15:18:01 +0100


Date: Fri, 11 Dec 92 15:18:01 +0100
From: Tim Berners-Lee <timbl@www3.cern.ch>
Message-Id: <9212111418.AA02698@www3.cern.ch>
To: Guido.van.Rossum@cwi.nl
Subject: Re: Gopher+ Considered Harmful 
Cc: www-talk@nxoc01.cern.ch


>  And I still don't.  I have the feeling that it would be much easier to
>  adapt HTTP to other (non-TCP) transport protocols if the size of an
>  entity is given separately rather than computed from the entity itself
>  (after all this nonsense is only necessary because TCP doesn't have a
>  way to distinguish EOF from a broken connection).  As I understand it
>  your main objection is that under my proposal you will have to
>  construct the necessary headers in a buffer first.  I don't believe
>  that this is that much of a hassle in today's computers -- it
>  shouldn't be more than a couple of kilobytes even in extreme cases,
>  which is peanuts even for a standard PC.

The space needed to buffer the stuff in the average case is not the problem.

There are extreme cases: long documents which spew out of format converters
piped into other format converters.  These would blow the memory of a
server, which we never like to do.

There is the cumulative effect on response times.  Currently, almost all the
W3 code is pipelined, so the response time (mouse click to first character on
screen) is a function of the round-trip delays and any real retrieval time.
The moment you put a buffer in to count bytes, you have to wait until the last
byte is available before you can send the first.  In the (frequent) case of
many stages being involved in a pipeline, adding stages does not in fact
increase the response time much: you just get a lot of CPU from the processors
along the pipeline.  Once you buffer it up, you are using CPU from one
processor at a time.  You can't start displaying it until you've parsed it,
you can't parse it until you've read it, you can't read it until the server
has counted it, and the server can't even start to count it until all the
real work has been finished.

You will notice the difference immediately. 
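
A rough sketch of the difference, in Python.  The stage names, the three-chunk
source and the event log are all made up for illustration; they are not taken
from the W3 code.  The piped response hands over its first chunk after one
read and one conversion, while the byte-counting response cannot hand over
anything until every chunk has been read, converted and counted.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def source(log: List[str]) -> Iterator[bytes]:
    """Hypothetical document source that produces three small chunks."""
    for part in (b"a", b"b", b"c"):
        log.append("read:" + part.decode())
        yield part


def convert(chunks: Iterable[bytes], log: List[str]) -> Iterator[bytes]:
    """Hypothetical format converter: passes each chunk on as soon as it has it."""
    for chunk in chunks:
        log.append("conv:" + chunk.decode())
        yield chunk.upper()


def piped(log: List[str]) -> Iterator[bytes]:
    """Pipe until EOF: no length is known in advance, chunks flow straight through."""
    return convert(source(log), log)


def buffered(log: List[str]) -> Iterator[bytes]:
    """Count bytes first: the whole body must exist before anything is sent."""
    body = b"".join(convert(source(log), log))
    log.append("count:%d" % len(body))
    yield body


def work_before_first_output(
    response: Callable[[List[str]], Iterator[bytes]]
) -> Tuple[bytes, List[str]]:
    """Return the first output and the events that had to happen before it."""
    log: List[str] = []
    first = next(response(log))
    return first, list(log)


if __name__ == "__main__":
    print(work_before_first_output(piped))
    print(work_before_first_output(buffered))
```

With more converter stages chained together, the piped version still needs
only one chunk to pass through each stage before the first output appears.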


Piping things until EOF is so much faster.  Can TCP really not tell the  
difference between a remote connection close, and a broken connection? :-((
(APIs apart)

>  An issue on which I don't have a strong opinion is whether we should
>  represent line separators as CRLF in the header -- anyone else?
>  


If you are going to be telnet-style, then CRLF it has to be.
My comment in the proposed spec

	http://info.cern.ch/hypertext/WWW/Protocols/HTTP/HTTP2.html
	
was "...In particular, lines should be regarded as terminated by the Line Feed,
and the preceding Carriage Return character ignored." under a note on
"tolerance".
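
As a minimal sketch of that tolerance rule, in Python: split on Line Feed and
drop a single Carriage Return if one immediately precedes it.  The function
name and the sample header text are assumptions for illustration only, not
part of the spec.

```python
from typing import List


def split_header_lines(data: bytes) -> List[bytes]:
    """Split header text into lines, treating LF as the terminator.

    A CR directly before the LF is ignored, so CRLF and bare LF
    line endings are both accepted.
    """
    lines = data.split(b"\n")
    # A trailing LF leaves one empty element at the end; it is not a line.
    if lines and lines[-1] == b"":
        lines.pop()
    return [line[:-1] if line.endswith(b"\r") else line for line in lines]


if __name__ == "__main__":
    # Mixed line endings, as a sloppy sender might produce them.
    print(split_header_lines(b"A: 1\r\nB: 2\nC: 3\r\n"))
```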

Tim