Re: Improve the traffic condition on the Internet

On Wed, 20 Mar 1996, Gao Hong wrote:

> 
> I have an idea about improving the traffic condition on the Internet.  We
> have all had the frustrating experience of waiting for a document to
> transfer from the Internet when the transfer rate is only about 100
> bytes/s or less.  Why don't we compress those documents first and expand
> them on-line at the client side?
> 
> I mean we should define a compression standard for use on the Internet.
> All the documents that are put onto the Internet would be compressed using
> the standard, and at the client side the browser would expand the
> documents while downloading them.  Using such technology, the user would
> not sense any delay or difference when reading the document, just as if
> there were no compression.
> 
> What is your opinion of this idea?

The fundamental idea is sound - but the reality is that the vast majority of
traffic *is* already compressed. Additional attempts at compression are
simply not going to provide much improvement in the current situation. 

At www.xmission.com (which serves tens of thousands of web pages from a few 
thousand different people), the top three categories by byte count in 
yesterday's traffic (out of a total of 640,628 files transferred 
and 6,083,382,508 total bytes) were:

Hits    Bytes         File type
======  ==========    =================
207994  2483328017    jpg graphic files
184952  1399457986    html files
181799  1248240540    gif graphic files

The total for the top three categories is 574,745 hits with 5,131,026,543
bytes. The remaining 65,883 hits and 952,355,965 bytes are scattered through
a large variety of file types, most of which appear to be already compressed
types (.mp2, .mpg, .zip, .gz, .Z, etc.).  Uncompressed .mov, .avi, .au, 
and .txt files together account for only 1,026 hits and 41,365,651 bytes. 

Neither the jpg nor the gif files can be effectively compressed much beyond
what they already are, so that pretty much leaves the html files. Assuming
an average 75% compression of the html files (not an unreasonable figure
for English text - does anyone have average compression figures for the JIS
family of text encodings?), the *net* savings would only be around 20% of
the total byte count, because html files represent only about a quarter of
the traffic right now. 
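
For anyone who wants to check that arithmetic, here it is as a short Python
sketch (the byte counts are the ones from the table above; the 75% figure is
just the assumption stated above, not a measurement):

    # Back-of-the-envelope: how much would compressing only the html save?
    # Byte counts are taken from the table above.
    total_bytes = 6083382508
    html_bytes  = 1399457986

    html_share = float(html_bytes) / total_bytes   # ~0.23 of all traffic
    assumed_ratio = 0.75                           # assumed 75% size reduction for html

    net_savings = html_share * assumed_ratio       # ~0.17, i.e. roughly 20%
    print("html share of traffic: %.0f%%" % (html_share * 100))
    print("net savings from compressing html alone: %.0f%%" % (net_savings * 100))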

The fact is that page authors are already making *excellent* use of
compression in general, and individual files are being kept quite small.
The average size of a jpg in the sample was only 11.9 Kbytes, the average
gif was 6.8 Kbytes, and the average html file was 7.5 Kbytes.
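
(Those averages are just bytes divided by hits for each row of the table
above; a quick check in the same vein:)

    # Average file size per category = total bytes / hits, numbers taken
    # from the table above.  The Kbyte figures quoted in the text are
    # rough roundings of these.
    samples = [("jpg", 207994, 2483328017),
               ("html", 184952, 1399457986),
               ("gif", 181799, 1248240540)]
    for name, hits, nbytes in samples:
        print("%-4s %d bytes/file on average" % (name, nbytes / hits))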

Finally, at the rate of growth of the net today, a savings of 20% in 
total traffic would just be 'setting back the clock' by about two 
or three months. It is a stopgap, not a solution to the fundamental
scaling problem of everyone accessing resources directly.
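
To put a number on "two or three months": if total traffic is growing at
somewhere around 8-10% per month (my rough guess, not a measured figure), a
one-time 20% cut is eaten up again in:

    # How long does a one-time 20% reduction last against steady monthly
    # growth?  The 8% and 10% growth rates are only rough guesses.
    from math import log
    for monthly_growth in (0.08, 0.10):
        months = log(1 / 0.8) / log(1 + monthly_growth)
        print("at %d%%/month growth: about %.1f months" % (monthly_growth * 100, months))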

A much more effective approach would be the widespread deployment of
hierarchical proxy servers and large-scale trans-continental mirroring of
sites to keep most accesses relatively local. Not that it would help that
much with the recent (non) performance of MCI and Sprint in the SF Bay Area.
I often can't reach places less than 6 hops and 10 miles away...
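
In case the proxy idea is unfamiliar, the basic shape of a cache hierarchy
comes down to something like the following - just an illustrative Python
sketch, not any particular proxy server's actual code:

    # Illustrative sketch of a hierarchical caching proxy.  Each level
    # answers from its own cache when it can and only asks the next level
    # up on a miss, so repeat requests never leave the local network.
    def fetch_from_origin(url):
        # Stand-in for a real HTTP request to the origin server.
        return "<contents of %s>" % url

    class CachingProxy:
        def __init__(self, parent=None):
            self.cache = {}        # url -> document
            self.parent = parent   # next proxy up the hierarchy (None = origin)

        def fetch(self, url):
            if url in self.cache:              # local hit: no wide-area traffic
                return self.cache[url]
            if self.parent is not None:        # miss: ask the parent proxy
                doc = self.parent.fetch(url)
            else:                              # top of the hierarchy: origin server
                doc = fetch_from_origin(url)
            self.cache[url] = doc              # cache on the way back down
            return doc

    national = CachingProxy()                  # cache near a backbone
    regional = CachingProxy(parent=national)   # regional cache
    local    = CachingProxy(parent=regional)   # ISP or campus cache
    local.fetch("http://www.xmission.com/")    # first request walks the hierarchy
    local.fetch("http://www.xmission.com/")    # repeat request is served locally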

-- 
Benjamin Franz

Received on Wednesday, 20 March 1996 11:24:33 UTC