Re: net load for title pre-fetch

> I would like to explore the objections to pre-fetch as a way of
> repairing nonsense link text.

Pre-fetching the TITLE is also important when an IMG used inside an A has no ALT text.

> Will the browsers be pre-fetching everything all the time, or
> only on user mode select and heuristic determination of junk
> text?

In both cases, IMG with no ALT and "Click Here" A, it has to be a
user decision. For IMG, I think a simple switch mode is enough:
"enable lookup of missing ALT". For A, it could be more detailed, with
heuristic pre-fetch or simple on-demand fetching of the target title.

> 	Least net bandwidth is that user exercises explicit "try harder"
> 	function when encountering an under-explained link, and
> 	at this point the browser creates a document citation by
> 	fetching the HTTP HEAD, that plus the first 1K bytes, or
> 	whatever the commercial indexing sites do.  We can even prototype
> 	by vectoring query to search cite tied to this specific
> 	URL [ER WG play item].

Yes, I think Henrik's minibot (http://www.w3.org/Robot/) could be used
to determine the optimal number of bytes to pre-fetch from the target
HTML page in order to reach the TITLE string.

A little explanation here: HTTP HEAD does *not* return the HTML HEAD
section, but the server's "HEAD" of the resource, which is really meta
information about the resource, be it a GIF file, an HTML file, or
anything else.
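For concreteness, here is a minimal sketch (Python standard library,
with www.w3.org used only as a placeholder host) of what an HTTP HEAD
request actually returns: response headers such as Content-Type and
Content-Length, never any part of the HTML markup.

    import http.client

    # Issue an HTTP HEAD request: the response carries only the
    # server's metadata headers (Content-Type, Content-Length, ...),
    # not the <HEAD> section of the HTML document itself.
    conn = http.client.HTTPConnection("www.w3.org")
    conn.request("HEAD", "/")
    resp = conn.getresponse()
    for name, value in resp.getheaders():
        print(name + ": " + value)
    conn.close()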

In order to fetch the HTML TITLE (which contains meaningful
information identifying the page most of the time), one has to use
what's called a byte-range HTTP request, where only a specific number
of bytes of the remote document is retrieved, and nothing more.
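Something along these lines (a rough sketch, not the actual browser
code; the URL and the 1K limit are just placeholders) would do the
on-demand fetch of the target title:

    import re
    import urllib.request

    # Fetch only the first N bytes of the target page with a Range
    # request, then look for the <title> element in that prefix.
    # Servers that honour Range reply 206 Partial Content; others
    # send the whole document, which read(N) still caps at N bytes.
    N = 1024
    req = urllib.request.Request(
        "http://www.w3.org/",
        headers={"Range": "bytes=0-%d" % (N - 1)},
    )
    with urllib.request.urlopen(req) as resp:
        prefix = resp.read(N).decode("iso-8859-1", errors="replace")

    match = re.search(r"<title[^>]*>(.*?)</title>", prefix,
                      re.IGNORECASE | re.DOTALL)
    print(match.group(1).strip() if match
          else "TITLE not in first %d bytes" % N)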

The problem is that there is no rule guaranteeing that the TITLE
always appears within the first 250 bytes, or within any other fixed
prefix. So some experimentation needs to occur to identify this
"Average Position of TITLE in HTML" function.
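A toy version of that experiment might look like the sketch below; the
sample URLs are placeholders, and a real run would of course crawl a
much larger sample (which is where the minibot comes in).

    import urllib.request

    # Measure the byte offset at which </title> ends on a sample of
    # pages, to estimate how many bytes a pre-fetch needs on average.
    SAMPLE_URLS = [
        "http://www.w3.org/",
        "http://www.w3.org/WAI/",
    ]

    def title_end_offset(url, limit=8192):
        with urllib.request.urlopen(url) as resp:
            prefix = resp.read(limit)
        pos = prefix.lower().find(b"</title>")
        return pos + len(b"</title>") if pos >= 0 else None

    offsets = [o for o in (title_end_offset(u) for u in SAMPLE_URLS)
               if o is not None]
    if offsets:
        print("average TITLE end offset: %d bytes"
              % (sum(offsets) // len(offsets)))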
