Predictive local caching

This is a proposed mechanism for preloading a browser's local
disk cache with a group of related web documents, such as a 
tutorial divided into sections. 

At the moment, loading each new section requires a call to the
original server (or cache server), with the attendant connection
overhead. Much would be gained by downloading the entire related
group in a single compressed transaction, and unpacking the
resulting compendium directly into the local cache.

The following elements will be needed:

File Format

This specifies the form of the compendium file. MIME multipart/mixed
is probably appropriate. The headers for each subpart should
contain the original URL of the document, together with all the
HTTP header information usually sent with it.

This file format should be browser independent.
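
As a rough illustration of the file format, here is a minimal
Python sketch that packs a few documents into one multipart/mixed
body. The function name build_compendium and the use of a
Content-Location header to carry the original URL are my own
assumptions for the example, not part of the proposal.

```python
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart


def build_compendium(documents):
    """Pack documents into one multipart/mixed body.

    documents is a list of (url, content_type, body, http_headers)
    tuples, where body is bytes and http_headers is a dict.
    Carrying the URL in Content-Location is an assumption of this
    sketch.
    """
    compendium = MIMEMultipart("mixed")
    for url, content_type, body, http_headers in documents:
        maintype, subtype = content_type.split("/", 1)
        part = MIMEBase(maintype, subtype)
        part.set_payload(body)
        # Base64 keeps arbitrary bytes (e.g. images) intact.
        encoders.encode_base64(part)
        part["Content-Location"] = url
        for name, value in http_headers.items():
            # Content-Type is already set on the part itself.
            if name.lower() != "content-type":
                part[name] = value
        compendium.attach(part)
    return compendium.as_bytes()
```

Each subpart then carries its own URL and HTTP headers, so a
reader of the file needs no knowledge of the browser that will
eventually store the documents.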

MIME type

Used by the browser to recognise an incoming compendium.

	X-WWW-Compendium  (for starters) 

Compression technique

Download time will be reduced if the compendium is compressed using
gzip or similar.
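
To make the compression step concrete, a minimal sketch using
Python's standard gzip module follows. The function names are
hypothetical, and the actual saving will depend on the content.

```python
import gzip


def compress_compendium(data: bytes) -> bytes:
    """Compress an assembled compendium for transfer."""
    return gzip.compress(data)


def decompress_compendium(data: bytes) -> bytes:
    """Recover the original compendium bytes on the client."""
    return gzip.decompress(data)
```

Markup-heavy text such as HTML tends to compress well, while
images that are already compressed (GIF, JPEG) gain little.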

Compendium Compiler

A program to generate a compendium from a web document subtree. It
could be set to omit certain sub-subtrees (e.g. bits requiring

Compendium Expander

This is the interesting one. A first stab could be a standalone
program, run by the browser as an external viewer. It would have to
understand the browser's cache format, so it would be
browser-specific. The standalone approach has the advantage that no
modification to the browser is required, and the compendium is
loaded by the user clicking a specific link.

A more ambitious project involves incorporating the expander in
the browser. This would be sensible, since the browser has all the
cache management routines to hand. It also opens up the possibility
of automatic compendium loading. 
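
A standalone expander might look roughly like the Python sketch
below. It assumes a gzip-compressed multipart/mixed compendium
whose subparts carry the original URL in a Content-Location
header. The cache layout (one file per document, named by a hash
of its URL, plus an index.json) is invented for illustration and
is not the format of any real browser's cache.

```python
import email
import gzip
import hashlib
import json
from pathlib import Path


def expand_compendium(compressed: bytes, cache_dir: Path) -> dict:
    """Unpack a compendium into a hypothetical on-disk cache.

    Returns a mapping of URL to the stored file name and type.
    """
    message = email.message_from_bytes(gzip.decompress(compressed))
    cache_dir.mkdir(parents=True, exist_ok=True)
    index = {}
    for part in message.walk():
        if part.is_multipart():
            continue  # skip the container, keep only leaf documents
        url = part.get("Content-Location")
        if url is None:
            continue  # without a URL the document cannot be cached
        name = hashlib.sha256(url.encode("utf-8")).hexdigest()
        (cache_dir / name).write_bytes(part.get_payload(decode=True))
        index[url] = {
            "file": name,
            "content_type": part.get_content_type(),
        }
    (cache_dir / "index.json").write_text(json.dumps(index, indent=2))
    return index
```

A real expander would replace the file-writing step with whatever
the target browser's cache actually requires, which is why the
standalone version has to be browser-specific.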

Compendium HTML tag

This is present in a document which is part of a compendium, and
is used to inform the browser that a compendium exists. The <LINK>
tag in the <HEAD> section would provide a suitable venue. The
tag may also contain the size of the compendium, and the number of
documents it contains. The browser would use this information to
decide when to autoload a compendium. Intermediate cache servers
could also use it to load their databases.
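
To show how a browser might act on such a tag, here is a small
Python sketch. The attribute names (REL="compendium", SIZE, COUNT)
and the size threshold are assumptions made up for the example;
the proposal does not fix them.

```python
from html.parser import HTMLParser


class CompendiumLinkFinder(HTMLParser):
    """Record the attributes of the first compendium <LINK> tag."""

    def __init__(self):
        super().__init__()
        self.compendium = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "link" or self.compendium is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("rel") or "").lower() == "compendium":
            self.compendium = attributes


def find_compendium(html: str):
    """Return the tag's attributes as a dict, or None if absent."""
    finder = CompendiumLinkFinder()
    finder.feed(html)
    return finder.compendium


def should_autoload(link: dict, max_bytes: int = 500_000) -> bool:
    """Autoload only when the advertised size is known and small."""
    try:
        size = int(link.get("size", ""))
    except (TypeError, ValueError):
        return False
    return size <= max_bytes
```

For example, a page whose <HEAD> contains
<LINK REL="compendium" HREF="tutorial.cmp" SIZE="48211" COUNT="12">
would be autoloaded under the default threshold, while a tag with
no SIZE would be left for the user to fetch by hand.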

I would be grateful for the remarks of members of this group on
these suggestions. In particular:

Is this the right place to discuss such matters?

Has anything like this been discussed before, and if so, where can I
find details?

Any glaring (or, indeed, subtle) flaws/omissions in the scheme?

Suggestions for modifications.

Details of the format of Netscape cache files. I believe they use
Berkeley DB, so details of this would be welcome.

    Mike Gahan 
    Information Systems Division
    University College London

Received on Tuesday, 16 July 1996 07:11:44 UTC