Another way to pipeline cache validation....

For revalidating conventional Web pages, it is pretty clear what you
can do to take advantage of HTTP/1.1's pipelining and buffering.  Our
SIGCOMM paper outlines strategies for this, along with the very
significant performance wins that result.

I now use a Java-based mail system (Pachyderm), and the startup time needed
to validate the Java class libraries has been a problem (at least at the
moment; it is a prototype system not yet using JAR files).  On the surface,
short of packaging everything up into JAR files, taking advantage of
pipelining in HTTP/1.1 is hard for Java.  A class starts executing, then at
run time finds it needs another class, goes to revalidate it, executes the
next class, revalidates, etc.  This effectively serializes cache validation
for Java applications whose classes are not packed into a single JAR file,
and results in N round trips, one for each class library.  Boring.
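
To put rough numbers on the N round trips, here is a minimal sketch of the
latency arithmetic.  The function names and the example figures (40 classes,
70 ms round-trip time) are hypothetical, chosen only for illustration, and
the model deliberately ignores server processing and transfer time.

```python
def validation_round_trips(num_classes, pipelined):
    """Round trips needed to revalidate num_classes cached class files.

    Serialized validation costs one round trip per class; a pipelined
    batch is modelled as a single round trip (an idealization).
    """
    if pipelined and num_classes > 0:
        return 1
    return num_classes


def validation_latency_ms(num_classes, rtt_ms, pipelined):
    """Startup delay from revalidation alone, in whole milliseconds."""
    return validation_round_trips(num_classes, pipelined) * rtt_ms


# Hypothetical figures: 40 classes over a link with a 70 ms round trip.
serial_ms = validation_latency_ms(40, 70, pipelined=False)
batched_ms = validation_latency_ms(40, 70, pipelined=True)
```

Under those assumed figures the serialized case spends 2800 ms waiting on
the network, against 70 ms for a single pipelined batch.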

In a discussion, Andrew Birrell (DECSRC), one of Pachyderm's authors, had a
good idea; it is clearly applicable to things other than Java, though its
use to speed Java startup is obvious.

Andrew's idea is to augment the cache database that a browser keeps on disk
with a list of the order in which things are accessed (per site, maybe
along with how soon each was accessed).  Then, when you see that you need
to validate the first class library, it becomes easy to figure out that
there are a bunch of other things the Java application is going to
want/need from that web site.  This makes it trivial to do a pipelined
cache validation of the set of objects likely to be needed, reducing the
whole thing to one round trip.  The technique could be used for forms of
cache validation other than Java classes as well, but it is obviously a
good idea for Java classes.
				- Jim Gettys
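
One way the idea above might be sketched: keep a per-site list of paths in
the order they were first requested, and when the first one needs
revalidating, emit conditional GETs for it and everything recorded after
it, back to back on one connection.  The names AccessLog, predicted_after
and build_pipelined_validation are hypothetical, and the use of entity tags
with If-None-Match is an assumption; this is not taken from Pachyderm or
any browser.

```python
from collections import defaultdict


class AccessLog:
    """Per-site record of the order in which cached paths were first requested."""

    def __init__(self):
        self._order = defaultdict(list)  # site -> [path, ...]

    def record(self, site, path):
        """Remember that `path` was requested from `site` (first access only)."""
        paths = self._order[site]
        if path not in paths:
            paths.append(path)

    def predicted_after(self, site, path):
        """Return `path` plus everything recorded after it for this site."""
        paths = self._order.get(site, [])
        if path not in paths:
            return [path]  # nothing known yet: validate just this one
        return paths[paths.index(path):]


def build_pipelined_validation(site, paths, etags):
    """Build conditional GETs for all `paths` as one byte string.

    Writing the result to a single persistent connection before reading
    any response is what makes the validation pipelined.  `etags` maps
    path -> cached entity tag (hypothetical cache metadata).
    """
    requests = []
    for path in paths:
        lines = ["GET %s HTTP/1.1" % path, "Host: %s" % site]
        tag = etags.get(path)
        if tag:
            lines.append("If-None-Match: %s" % tag)
        requests.append("\r\n".join(lines) + "\r\n\r\n")
    return "".join(requests).encode("ascii")


# Hypothetical usage: an earlier run touched three class files in this order.
log = AccessLog()
for p in ["/classes/Mail.class", "/classes/Folder.class", "/classes/Index.class"]:
    log.record("mail.example.com", p)

batch = log.predicted_after("mail.example.com", "/classes/Mail.class")
wire = build_pipelined_validation(
    "mail.example.com", batch, {"/classes/Mail.class": '"v1"'}
)
```

A real cache would also want to drop entries that stop being requested and
to cope with a server that closes the connection partway through the batch.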

Forwarded message 1

  • From: Mail Delivery Subsystem <MAILER-DAEMON@mail1.digital.com>
  • Date: Thu, 4 Dec 1997 11:19:32 -0800 (PST)
  • Subject: Returned mail: User unknown
  • To: <jg@pa.dec.com>
  • Message-Id: <199712041919.LAB31475@mail1.digital.com>
The original message was received at Thu, 4 Dec 1997 11:11:28 -0800 (PST)
from pachyderm.pa.dec.com [16.4.16.23]

   ----- The following addresses have delivery notifications -----
<http-wg@cuckoo.hpl.com>  (unrecoverable error)

   ----- Transcript of session follows -----
... while talking to gate.hpl.com.:
>>> RCPT To:<http-wg@cuckoo.hpl.com>
<<< 550 Unable to add recipient.
550 <http-wg@cuckoo.hpl.com>... User unknown

   ----- Original message follows -----

Return-Path: <jg@pa.dec.com>
Received: from pachyderm.pa.dec.com (pachyderm.pa.dec.com [16.4.16.23])
	by mail1.digital.com (8.7.5/UNX 1.5/1.0/WV) with SMTP id LAA01213; 
	Thu, 4 Dec 1997 11:11:28 -0800 (PST)
Received: by pachyderm.pa.dec.com; id AA10741; Thu, 4 Dec 1997 11:11:27 -0800
Date: Thu, 4 Dec 1997 11:11:27 -0800
From: jg@pa.dec.com (Jim Gettys)
Message-Id: <9712041911.AA10741@pachyderm.pa.dec.com>
X-Mailer: Pachyderm (client pachyderm.pa-x.dec.com, user jg)
To: w3c-http@w3.org, http-wg@cuckoo.hpl.com
Subject: Another way to pipeline cache validation....

Received on Thursday, 4 December 1997 11:36:44 UTC