Re: Improving If-Modified-Since

On Mon, 14 Aug 1995, Lou Montulli wrote:
>> >Why would a server pumping out bogus last-modified headers act appropriately
>> >to another type of check?  Adding something to the protocol just because
>> >another part is not being used properly seems a bit weird.  If I'm
>> >understanding the problem correctly.
>>
>> The problems currently encountered are mostly caused by the date comparisons
>> done by most HTTP servers when dealing with If-modified-since requests.
>> Most servers assume that as long as the If-modified-since date is equal to
>> or AFTER the current modification date of the document then it is unchanged.
>>
>> This is a problem because people screw up the dates on their files and
>> sometimes give them dates far into the future.  When they fix the
>> dates of the files to correspond to the current date, caches never
>> get updated.
>
>If people "screw up their dates", they're hurting themselves and the people
>who view their pages.  This *isn't* accidental, is it?  How do you
>accidentally set a last-modified to be some future time?  It's like a
>surgeon accidentally removing a right leg instead of a left leg or a
>disgruntled employee accidentally shooting his boss.

It's a lot easier than you think.  Moving files via tar or NFS can
carry over dates from other machines whose clocks were set incorrectly.
Even the local system clock can get set incorrectly, causing lots
of files to be stamped with wrong dates.
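
To make the failure concrete, here is a minimal sketch in Python of the
greater-than-or-equal check that most servers do.  The function name and
the dates are invented for illustration; this is not any particular
server's code.

```python
from datetime import datetime, timezone

def not_modified_ge(if_modified_since: datetime, last_modified: datetime) -> bool:
    """Common server check: treat the document as unchanged when the
    If-Modified-Since date is equal to or AFTER its modification date."""
    return if_modified_since >= last_modified

# Hypothetical dates: the file was first served with a bogus future date,
# which the cache stored, and was later corrected to the real date.
bogus_future = datetime(1999, 1, 1, tzinfo=timezone.utc)
corrected = datetime(1995, 8, 14, tzinfo=timezone.utc)

cached_date = bogus_future  # what the cache sends in If-Modified-Since
stale = not_modified_ge(cached_date, corrected)
print(stale)  # True: the server answers 304 and the cache never refreshes
```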

>
>Or maybe it's intentional...
>>
>> In addition to supporting size=SIZE I encourage other server authors to
>> do an _equals_ comparison rather than a greater than or equal comparison
>> of the two dates.
>
>They can't just do a strcmp() since there are a couple date formats it
>needs to deal with.  Also, consider a situation where there are three
>mirrors for a web site, and all three are hidden behind www.host.com and
>selected through shortest-return-trip calculations (like the CERN
>linemode browser does).  Getting a last-modified on one which was later
>than the last-modified on the other, even though they are the same
>document, certainly makes sense.

No, it doesn't make sense.  A document has only one real last-modified
date, and it is specified in absolute time.  There is never any need to
accept wrong dates; the real modification date should be propagated to
all the servers.  BUT, if a particular server implementation
chooses to do what you have suggested, that's fine.  I would just
like to see the standard behaviour for single servers be exact date
comparison rather than older than or equal to.
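
On the strcmp() point: an exact comparison does not have to be a string
comparison.  A rough sketch in Python, assuming the server converts both
dates to absolute time first.  The helper names are hypothetical, and the
three layouts are the RFC 1123, RFC 850 and asctime() forms that HTTP
dates usually arrive in; weekday and month names are assumed to be in
English.

```python
from datetime import datetime, timezone

# Date layouts assumed for this sketch: RFC 1123, RFC 850, asctime().
_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",
    "%A, %d-%b-%y %H:%M:%S GMT",
    "%a %b %d %H:%M:%S %Y",
)

def parse_http_date(text: str) -> datetime:
    """Convert any of the assumed layouts to an absolute UTC time."""
    for fmt in _FORMATS:
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # not this layout, try the next one
        return parsed.replace(tzinfo=timezone.utc)
    raise ValueError(f"unrecognised HTTP date: {text!r}")

def not_modified_exact(if_modified_since: str, last_modified: str) -> bool:
    """Unchanged only when both dates name the same instant."""
    return parse_http_date(if_modified_since) == parse_http_date(last_modified)
```

With this, a cached copy carrying a bogus future date no longer matches
once the file's date has been corrected, so the server sends the document
again instead of a 304.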

>
>> >Will the "size" be determined from the Content-length header or the size on
>> >the cache's disk?  If the former, documents with incorrect content-length
>> >headers are essentially uncacheable, as are results from CGI scripts which
>> >generally don't have content-length headers.  If the latter, could there
>> >be encoding problems?
>>
>> The size is determined by taking the current length of the document in the
>> cache.  The content-length of the transfer is discarded, so encodings
>> should have no effect.
>
>Okay, so what should a server do when it doesn't know the actual
>content-length of an object, like a CGI script?  It's totally plausible to
>ask a dynamic object "hey, has any data upon which you would answer this
>question changed since *blah*" in a way that's very quick to answer, but it's
>hard to imagine asking it "hey, will your final output be any size
>different than *blah*" without having it do what it normally does.  So,
>if it's in there, it must be optional for the server to use it.
>
Fine, then that CGI script can ignore the size and return 304.  Having
extra information available does nothing to hinder the success of
caching; it can only help.
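
Roughly, the size is just one more test a server may apply when it has
the information.  A minimal sketch in Python; the function and parameter
names are made up for illustration, and how size=SIZE actually travels in
the request is not shown here.

```python
from datetime import datetime, timezone
from typing import Optional

def conditional_status(
    if_modified_since: datetime,
    cached_size: Optional[int],
    last_modified: datetime,
    current_size: Optional[int],
) -> int:
    """Return 304 if the cached copy still looks valid, otherwise 200."""
    # Exact date comparison: any mismatch means the document is resent.
    if if_modified_since != last_modified:
        return 200
    # The size check is optional: it applies only when the client sent a
    # size AND the server knows the current length.  A CGI script that
    # cannot know its output length passes None and the size is ignored.
    if cached_size is not None and current_size is not None:
        if cached_size != current_size:
            return 200
    return 304

# Hypothetical example values.
when = datetime(1995, 8, 14, tzinfo=timezone.utc)
print(conditional_status(when, 2048, when, None))  # CGI case: size ignored
print(conditional_status(when, 2048, when, 4096))  # lengths differ: resend
```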

:lou
-- 
Lou Montulli                 http://www.mcom.com/people/montulli/
       Netscape Communications Corp.

Received on Monday, 14 August 1995 23:40:40 UTC