Two newish technical reports based on an HTTP trace

About a year ago, I gathered a long (90-day) proxy trace of
HTTP requests, including MD5 digests of the response bodies.
(No, you can't get access to this trace, so please don't even ask.)
I've finished a pair of technical reports based on this trace:

    "A trace-based analysis of duplicate suppression in HTTP"
    Compaq Western Research Lab Research Report 99/2, November 1999
    Many HTTP resources (pages, graphics, etc.) are exact
    duplicates of other resources with different URLs. If an HTTP
    cache contains a duplicate of a requested resource, and could
    detect this, it could avoid substantial network costs by
    returning the cached duplicate in place of the requested URL.
    Previous studies have shown that there is substantial
    duplication of content in both HTTP and FTP, and several
    protocols have been proposed to support efficient and safe
    duplicate suppression in HTTP. We use traces covering
    millions of HTTP requests to quantify the potential benefit
    of an HTTP duplicate-suppression extension. In particular, we
    show that the benefits vary depending on content-type, and
    that a small fraction of Web servers account for most of the
    duplicated resources.
	for Postscript & PDF format versions

(2) "Errors in timestamp-based HTTP header values"
    Compaq Western Research Lab Research Report 99/3, December 1999

    Many of the caching mechanism in HTTP, especially in
    HTTP/1.0, depend on header fields that carry absolute
    timestamp values.  Errors in these values could lead to
    undetected cache incoherence, or to excessive cache misses.
    Using an extensive proxy trace, we looked for HTTP responses
    exhibiting several different categories of timestamp-related
    errors.  A significant fraction of these responses have
    detectable errors in timestamp-based header fields.
	for Postscript & PDF format versions


P.S.: The astute observer will notice that there is also a
timestamp error in the date on one of these technical reports; I
was a bit optimistic about when I would finish it, and then it
was easier to issue it back-dated than to revise our report
numbering scheme.

Received on Wednesday, 8 March 2000 15:10:42 UTC