Re: two ideas...

    I grabbed a copy of the Touch and Farber paper cited earlier in
    this thread, which seemed to deal with FTP. This described a
    pre-send selection algorithm of sending everything in the currently
    selected directory. The Boston University system used a simple
    1-level probabilistic model to pick the best candidates for
    pre-send, and used far less extra bandwidth, though with a higher
    probability of a cache miss. There's lots of stuff to tune with
    speculation.
    
The model that Venkata Padmanabhan and I had been working on is
a little different from the BU and Touch/Farber models (as far as
I have been able to learn).

We started from these principles:
	(1) The goal of prefetching is to utilize otherwise
	unused bandwidth to reduce perceived latency.
	(1a) Prefetching should never increase perceived latency.

	(2) Only the server has sufficient history or knowledge
	to be able to predict, based on the retrieval of one URL,
	which other URLs are likely to be retrieved by a client.
	
	(3) Only the client (browser) knows what is in its own
	cache, and whether the user has shifted attention.

Therefore, in our approach (which assumes a persistent connection):
	(1) the server keeps a history of per-client fetch
	streams in order to build a Markov model for making
	predictions (this is based on a file system paper
	by Griffioen and Appleton in Summer '94 USENIX).
	I suppose our approach would work just as well with
	the more mechanistic Touch/Farber approach, although
	that could probably be simulated by a more elaborate
	probabilistic model.
	
	(2) after each retrieval, the server consults its
	Markov model to obtain one or more predictions of the
	next retrieval by the same client.  A "prediction"
	includes a URL and a probability.  [It might also
	include a cache validator for the URL?]
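
	As a concrete illustration of (1) and (2): a rough Python
	sketch of the sort of first-order Markov predictor we have in
	mind (names invented; real bookkeeping such as aging old
	history and bounding the table size is glossed over):

		from collections import defaultdict

		class MarkovPredictor:
		    """First-order Markov model over URL fetch streams:
		    counts how often URL B immediately follows URL A,
		    tracking each client's last fetch separately."""

		    def __init__(self):
		        self.transitions = defaultdict(lambda: defaultdict(int))
		        self.last_fetch = {}   # client id -> last URL fetched

		    def record(self, client, url):
		        """Called on every retrieval to update the model."""
		        prev = self.last_fetch.get(client)
		        if prev is not None:
		            self.transitions[prev][url] += 1
		        self.last_fetch[client] = url

		    def predict(self, url):
		        """Return (next_url, probability) pairs for URLs
		        seen to follow 'url', most probable first."""
		        followers = self.transitions.get(url)
		        if not followers:
		            return []
		        total = sum(followers.values())
		        preds = [(u, n / total) for u, n in followers.items()]
		        return sorted(preds, key=lambda p: p[1], reverse=True)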
	
	(3) The server maintains a threshold probability,
	perhaps statically chosen by an administrator, or
	dynamically chosen based on server load.  It transmits
	predictions to the client if their probability
	exceeds this threshold.  For example, if the predictions
	are:
		/a/b/c	.9
		/a/b/d	.7
		/a/b/e	.1
	and the threshold is .5, then the server transmits only
	the first two predictions.
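
	In terms of the sketch above (again, purely illustrative),
	this filtering step is just:

		def server_predictions(model, url, threshold):
		    """Send only predictions whose probability exceeds the
		    server's (static or load-dependent) threshold."""
		    return [(u, p) for (u, p) in model.predict(url)
		            if p > threshold]

		# With the example above and threshold .5, this returns
		# [('/a/b/c', 0.9), ('/a/b/d', 0.7)].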

	(4) We would modify HTTP to add a new response header
	to carry predictions.  Ideally, these should come AFTER
	the data in the response (to avoid any extra latency),
	but it might be tricky to do that, and it might not matter
	that much.  The header might also carry the size of the
	object, in case this influences the prefetching decision.
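
	For instance, a response carrying predictions might look
	something like this (header name per (b) below, all values
	invented, and the header shown before the data only because
	that is the easy place to put it):

		HTTP/1.1 200 OK
		Content-type: text/html
		Content-length: 3056
		Prediction: /a/b/c 0.9 2048
		Prediction: /a/b/d 0.7 512

		...3056 bytes of response body...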
	
	(5) The client receives the predictions, and immediately
	weeds out anything that is already valid in its cache.
	It also maintains its own threshold probability, and
	weeds out predictions below that threshold.  If any
	predictions are left, and the user has not yet made
	an explicit request, the browser prefetches the predicted
	URLs.
	
	(6) If the user makes an explicit request while a prefetch
	is happening (except for the prefetched URL!), the browser
	immediately aborts or suspends the prefetch and does what
	the user asked for.
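
	A client-side sketch of (5) and (6), with invented names
	(cache.is_valid(), user_request_pending(), and so on stand in
	for whatever the browser actually provides internally):

		def handle_predictions(preds, cache, client_threshold):
		    # Step (5): drop anything already valid in the cache,
		    # and anything below the client's own threshold.
		    candidates = [(url, p) for (url, p) in preds
		                  if p > client_threshold
		                  and not cache.is_valid(url)]
		    for url, p in candidates:
		        if not prefetch(url, cache):
		            break    # user took over; stop speculating

		def prefetch(url, cache):
		    # Step (6): fetch in small pieces so we can bail out
		    # the moment the user explicitly requests some *other*
		    # URL (a request for this one just means the prediction
		    # was right, so the fetch is allowed to finish).
		    conn = open_connection(url)    # hypothetical internals
		    body = b""
		    while True:
		        if user_request_pending(other_than=url):  # hypothetical
		            conn.abort()
		            return False
		        piece = conn.read_some()
		        if not piece:
		            break
		        body += piece
		    cache.store(url, body)
		    return True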

As far as I can tell, ours differs from the other "prefetch"
approaches in that we split up the prediction mechanism
from the control mechanism, since it's not possible to optimize
both in the same place.  The server knows the behavior of clients
in general, but doesn't know the state of a particular client's
cache or the preferences of its user.  The browser knows its
cache state and user preferences, but may never have visited
this server before, so cannot make its own predictions.  And
(ignoring the effects of congestion), since the client has
complete control over the prefetching mechanism, if the user
shifts attention or makes an unpredicted request, the prefetch
can be aborted immediately, so no extra latency is imposed.

We believe that this can be easily added to HTTP 1.x, since it
requires only:
	(a) persistent-connection support, already in 1.1
	(b) a new response header to transmit predictions, e.g.,
		Prediction: <URL> <probability> [<size>] [<validator>]
	(c) an effective way of aborting prefetches
(c) is actually useful with persistent connections in any event.
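
A concrete instance of (b), with invented values (the validator
taking whatever opaque form cache validation ends up using), might
read:
		Prediction: /a/b/c 0.9 2048 "v1.7"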

-Jeff
