Prefetch manifests

Implemented by:
Sergey Nikitin, Ilya Golubtsov, Alexey Shakin
Spec edited by:
Chaals McCathie Nevile

Introduction

Prefetching is a mechanism for periodically refreshing the static resources that web pages use and that are held in the browser cache. It is useful for sites that rely on static resources but often change which resources a particular page uses.

The method is to publish a list of resources commonly used on a website. User agents can periodically fetch the list, then download and cache the listed resources to speed up rendering.

Related work

This work is related to cache control information. Prefetch is explicitly designed to optimise for pages which are not currently being requested.

Use cases

http://news.example.com changes the headline image on the front page every two hours - but since the static image files themselves don't change, the cache-control information never reflects the change. They want to help browsers make the site load faster by pre-fetching the new images. (R1)

The Yowser web browser checks prefetch information for current tabs and for a user-defined number of history pages (default: the last 100 pages opened). It also checks for prefetch information when an open page has prefetch information and links to another page on the same site. (R1, R2)

http://lots-of-apps.example.com has many pages which use its frameworks example.js, example.css, example.png, and example.ogv. In order to ensure that pages are rendered rapidly when navigating through the website, it uses a prefetch manifest to identify files that should be readily available even if not required on a given page.

The Amnesia browser compares prefetch information to content in its cache. If it has content that was recorded as prefetch information, and is no longer listed, it expels it from cache to save memory.
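
The eviction step in this use case amounts to a set difference. The following is a minimal sketch, assuming the browser tracks which cached URLs were originally stored because of prefetch information; the function and parameter names are hypothetical and not part of this specification.

```python
from typing import Set


def stale_prefetched_entries(prefetched_in_cache: Set[str],
                             current_manifest: Set[str]) -> Set[str]:
    """Return cached URLs that were prefetched earlier but are no longer listed.

    Assumption: the cache records which entries came from prefetch information,
    so entries cached for other reasons are never passed in here.
    """
    return prefetched_in_cache - current_manifest


# Usage: the old stylesheet is no longer listed, so it is a candidate for eviction.
cached = {"http://example.com/1.2.33/css/all.css",
          "http://example.com/1.2.34/css/all.css"}
listed = {"http://example.com/1.2.34/css/all.css",
          "http://example.com/1.2.34/js/all.js"}
to_evict = stale_prefetched_entries(cached, listed)
```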

Requirements

R1: Prefetch information should be accessible via HTTP or HTTPS

R2: It must be possible to specify prefetch information for multiple pages on a single site

Defining prefetch information with prefetch.txt

The prefetch manifest is implemented by means of a resource with a "well-known location", analogous to robots.txt: SITE_ROOT_PATH/prefetch.txt

The file consists of a list of URLs, or comments, one per line. Relative

Example of prefetch.txt:

This would be the body returned by a GET request to http://example.com/prefetch.txt

# prefetch.txt. Version 1.2.34
http://yandex.st/jquery/1.7.1/jquery.min.js
//yandex.st/common/common.js
/1.2.34/css/all.css
1.2.34/js/all.js
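
A user agent could turn the example above into absolute URLs roughly as follows. This is a sketch under two assumptions that the text does not confirm: lines beginning with "#" are comments (inferred from the example), and relative entries are resolved against the URL of the manifest itself. The function name is hypothetical.

```python
from typing import List
from urllib.parse import urljoin


def parse_prefetch_manifest(body: str, manifest_url: str) -> List[str]:
    """Return the absolute URLs listed in a prefetch.txt body."""
    urls = []
    for raw_line in body.splitlines():
        line = raw_line.strip()
        # Assumption: blank lines are skipped and '#' starts a comment line.
        if not line or line.startswith("#"):
            continue
        # Assumption: relative entries resolve against the manifest's own URL.
        urls.append(urljoin(manifest_url, line))
    return urls


EXAMPLE = """\
# prefetch.txt. Version 1.2.34
http://yandex.st/jquery/1.7.1/jquery.min.js
//yandex.st/common/common.js
/1.2.34/css/all.css
1.2.34/js/all.js
"""

resolved = parse_prefetch_manifest(EXAMPLE, "http://example.com/prefetch.txt")
```

Under these assumptions the scheme-relative entry inherits "http:" from the manifest URL, and the last two entries both resolve under http://example.com/.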

Using prefetch information

If an Expires header is received for a resource, user agents should calculate a random delay and add it to the expiry time, to help avoid a DDoS-like effect from a large number of browsers making prefetch requests at the same moment.
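
One way to apply such a random delay is sketched below. The text does not give a range for the delay, so MAX_JITTER_SECONDS is a purely hypothetical upper bound chosen for illustration, and the function name is likewise an assumption.

```python
import random
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime
from typing import Optional

# Hypothetical upper bound; the specification does not define the delay range.
MAX_JITTER_SECONDS = 300


def next_prefetch_time(expires_header: str,
                       rng: Optional[random.Random] = None) -> datetime:
    """Return the Expires time plus a uniformly random delay."""
    rng = rng or random.Random()
    expires_at = parsedate_to_datetime(expires_header)
    jitter = timedelta(seconds=rng.uniform(0, MAX_JITTER_SECONDS))
    return expires_at + jitter


# Usage: each browser picks its own delay, spreading requests over the window.
refresh_at = next_prefetch_time("Wed, 21 Oct 2015 07:28:00 GMT")
```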

Prefetch requests must not include cookies.
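
As an illustration of a cookie-free request, the sketch below uses Python's standard urllib. It relies on the fact that a freshly built opener has no cookie handler installed, so no Cookie header is attached; the helper names are hypothetical.

```python
import urllib.request


def build_prefetch_request(url: str) -> urllib.request.Request:
    """Build a GET request that carries no Cookie header."""
    return urllib.request.Request(url, method="GET")


def open_without_cookies(request: urllib.request.Request, timeout: float = 10.0):
    """Send the request through an opener that has no cookie processing."""
    # build_opener() does not add HTTPCookieProcessor unless asked to,
    # so stored cookies are neither sent nor saved.
    opener = urllib.request.build_opener()
    return opener.open(request, timeout=timeout)


request = build_prefetch_request("http://example.com/1.2.34/css/all.css")
```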

A user agent may record prefetch information for URLs.

Unless otherwise specified, browsers should respect cache-control. Do we need to say that, or is it obvious?

When a user agent opens a URL it should check whether it has prefetch information recorded. If there is no information, it should request /prefetch.txt (resolved relative to the URL) and record the result. If there are resources listed in prefetch.txt which are not in the user agent's cache, it may fetch the resources specified.
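
The steps above can be sketched as follows. The record store, the cache, and the fetch_manifest callback are all hypothetical stand-ins for browser internals, and keying the record by manifest URL is an assumption rather than something this text requires.

```python
from typing import Callable, Dict, List, Set
from urllib.parse import urljoin


def resources_to_prefetch(page_url: str,
                          records: Dict[str, List[str]],
                          cache: Set[str],
                          fetch_manifest: Callable[[str], List[str]]) -> List[str]:
    """Return listed resources that are not yet cached for the opened page."""
    # "/prefetch.txt" resolved relative to the opened URL gives the site-root manifest.
    manifest_url = urljoin(page_url, "/prefetch.txt")
    # Only request the manifest when nothing has been recorded yet.
    if manifest_url not in records:
        records[manifest_url] = fetch_manifest(manifest_url)
    # The user agent may fetch whatever is listed but missing from its cache.
    return [url for url in records[manifest_url] if url not in cache]


# Usage with a stub in place of a real network fetch.
requested: List[str] = []

def stub_fetch(manifest_url: str) -> List[str]:
    requested.append(manifest_url)
    return ["http://example.com/a.js", "http://example.com/b.css"]

records: Dict[str, List[str]] = {}
missing = resources_to_prefetch("http://example.com/news/today.html",
                                records, {"http://example.com/a.js"}, stub_fetch)
```

A second call for any page on the same site reuses the recorded list instead of requesting prefetch.txt again.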