Re: Improving If-Modified-Since from Chuck Shotton on 1995-08-16 (ietf-http-wg@w3.org from July to September 1995)

From: Chuck Shotton <cshotton@biap.com>
Date: Wed, 16 Aug 1995 09:08:38 -0500
To: Lou Montulli <montulli@mozilla.com>
Cc: Carlos Horowicz <carlos@patora.mrec.ar>, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <v02120d19ac57a7ce5be9@[198.64.246.22]>

At 8:07 PM 8/15/95, Lou Montulli wrote:

>> The point I was trying to make is that size is meaningless between any two
>> machines when text files are being compared. CR/LF on a Windows box may be
>> stored as LF only on a Unix host, or CR on a Mac may be converted into
>> CR/LF for transmission. This means that the original file on the server is
>> a physically different size than the file transmitted to the client/proxy,
>> and that the cached file size on the proxy may be an entirely different
>> size than the content-length.
>>
>> File size is essentially useless as a mechanism for determining whether or
>> not a cached file is the correct version.
>
>Your facts don't support your conclusion.  If linefeed conversion is
>performed by the server then it should be done consistently; you can
>therefore always compute the eventual size of the object by parsing the
>file.  If a proxy modifies the file in any way then it needs to remember
>its original size.

Lou, why are you forcing this computation on the server? The whole problem
of corrupted or stale caches is a CLIENT problem and the computation should
happen there. Why should a server be forced to read and translate every
byte of a file, just so it can calculate the content-length for an IMS
request from a client that is trying to use file size to determine file
"sameness"? This is extremely burdensome on the server and shouldn't be the
server's job.
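
As a rough illustration of the cost in question, the sketch below computes
the transmitted length of a text file when line endings are converted to
CR/LF on the way out. The function name translated_length and the choice
of CR/LF as the wire format are assumptions made for the example, not
something either party specified. Note that it has to touch every byte of
the file, even when the body itself will never be sent.

```python
def translated_length(data: bytes) -> int:
    """Return the length of `data` after every line ending becomes CR/LF.

    Hypothetical helper: assumes the server normalizes CR, LF, and CR/LF
    to CR/LF for transmission.
    """
    # Collapse all three conventions to a bare LF first (a full scan).
    normalized = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    # Each LF then grows by one byte when it is expanded to CR/LF.
    return len(normalized) + normalized.count(b"\n")


# The same two-line text stored under three EOL conventions.
unix_copy = b"a\nb\n"         # 4 bytes on disk
mac_copy = b"a\rb\r"          # 4 bytes on disk
windows_copy = b"a\r\nb\r\n"  # 6 bytes on disk
```

All three copies come out at the same transmitted length, which differs
from the on-disk size for the Unix and Mac copies.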

Here is an alternative to this whole "size" thing. Let's start with an
assumption. You may not agree with it, but my experience bears it out.
Let's assume that for any client/server pair, the two machines store the
same text file with different sizes because of differences in EOL
conventions. That means that the ONLY common size they can have is the
content-length that the server reports. Since this requires action on the
server to compute, it isn't an acceptable mechanism. SO, why not do the
following?

When a server sends a file, the client should make sure that it received at
least content-length bytes of data. If it doesn't receive the complete
file, it can still display the file, but shouldn't cache it. If the client
DOES receive the required number of bytes, it should compute a checksum for
the file and cache the contents. Future requests for the file can simply
use the existing IMS syntax to retrieve new versions based on modification
dates. If dates match, the client can verify that the file in the local
cache is uncorrupted simply by recalculating and comparing the checksums.
All the work for this is done by the client, no server modifications are
required, and no wasted CPU cycles are spent on the server calculating
content-length values for a file that won't be transmitted.
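
The client-side procedure above might be sketched roughly as follows. This
is only a minimal illustration under stated assumptions: the ClientCache
and CacheEntry names, the in-memory dictionary, and the use of SHA-256 as
the checksum are all hypothetical choices, since the proposal does not
name a checksum algorithm or a cache layout. The verify method is meant to
be called after the server has answered an IMS request by saying the dates
match.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CacheEntry:
    body: bytes
    last_modified: str  # date to send back in If-Modified-Since
    checksum: str       # computed by the client when the file was cached


def checksum_of(body: bytes) -> str:
    # Assumption: any stable checksum would do; SHA-256 is used here.
    return hashlib.sha256(body).hexdigest()


class ClientCache:
    def __init__(self) -> None:
        self._entries: Dict[str, CacheEntry] = {}

    def store(self, url: str, body: bytes, content_length: int,
              last_modified: str) -> bool:
        """Cache the body only if at least content_length bytes arrived."""
        if len(body) < content_length:
            return False  # truncated: may be displayed, but not cached
        self._entries[url] = CacheEntry(body, last_modified,
                                        checksum_of(body))
        return True

    def verify(self, url: str) -> Optional[bytes]:
        """Return the cached body if its checksum still matches.

        Call this after the server reports the file as unmodified.
        """
        entry = self._entries.get(url)
        if entry is None:
            return None
        if checksum_of(entry.body) != entry.checksum:
            del self._entries[url]  # corrupted copy: drop it and refetch
            return None
        return entry.body
```

A client would call store after each full response and verify after each
"not modified" reply, falling back to an unconditional request whenever
verify returns None. The server does nothing beyond ordinary IMS handling.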

What in this proposal doesn't meet the initial requirements for the
proposed "SIZE" addition to the IMS header? You are still able to confirm
that you have the current version, and you are also able to detect
truncated and/or corrupted files on the client side. And, it doesn't break
the installed server base.

--_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
Chuck Shotton                               StarNine Technologies, Inc.
chuck@starnine.com                             http://www.starnine.com/
cshotton@biap.com                                  http://www.biap.com/
                 "Shut up and eat your vegetables!"

Received on Wednesday, 16 August 1995 07:11:36 UTC