Fwd: Archiving http proxy cache?

From: Johan Hjelm (hjelm@w3.org)
Date: Fri, Jan 15 1999


Message-Id: <4.1.19990115191135.00b3a350@127.0.0.1>
Date: Fri, 15 Jan 1999 19:12:03 +0100
To: www-wca@w3.org
From: Johan Hjelm <hjelm@w3.org>
Subject: Fwd: Archiving http proxy cache?

This is an idea we might want to build on in the WCA.

Johan
>Resent-Date: Fri, 15 Jan 1999 00:28:39 -0500 (EST)
>Date: Fri, 15 Jan 1999 00:28:34 -0500
>From: Gerald Oskoboiny <gerald@w3.org>
>To: w3t-nerd@w3.org
>X-Mailer: Mutt 0.93.2
>Subject: Archiving http proxy cache?
>Resent-From: w3t-tech@w3.org
>X-Mailing-List: <w3t-tech@w3.org> archive/latest/2169
>X-Loop: w3t-tech@w3.org
>Sender: w3t-tech-request@w3.org
>Resent-Sender: w3t-tech-request@w3.org
>
>Anyone have any ideas?
>
>----- Forwarded message from Gerald Oskoboiny <gerald@impressive.net> -----
>
>From: gerald@impressive.net (Gerald Oskoboiny)
>Newsgroups: comp.infosystems.www.servers.unix,comp.infosystems.www.misc
>Subject: Archiving http proxy cache?
>Date: 15 Jan 1999 05:22:58 GMT
>Message-ID: <slrn79tk5u.276.gerald@devo.impressive.net>
>
>I've been archiving my incoming and outgoing e-mail for the past
>6 years or so, and now that disk space is basically free I'd like
>to do the same for my personal HTTP traffic.
>
>Does anyone have ideas on what software/configuration to use for
>something like this?
>
>I installed Squid and gave it a big cache to fill, but it doesn't
>quite do what I want:
>
>  - it stores HTTP response headers and other metadata inside the
>    cached files (so the files are no longer valid GIF or HTML
>    files on their own because there's extra stuff at the top);
>    this data should be stored externally, IMO.
>
>  - it doesn't keep previous revisions of documents, only the
>    most recently fetched one (hmm, I could probably fix this
>    just by replacing the unlinkd program with one that does
>    nothing.)
>
>  - its cache storage scheme makes sense for a general proxy
>    cache system, but for archiving I'd prefer a directory/file
>    structure more like:
>
>        $cache_root/1999/01/15/http/www.w3.org/foo.html
>
>Any ideas? Would I be better off using Apache or Jigsaw for this?
>(as a basis for hacking/customization, I mean; I doubt that
>there's anything that does exactly what I want as-is.)
>
>It would probably be easiest for me to just write a Perl script
>that does what I want and install that as the root document of a
>locally-running Apache httpd, but that would probably slow things
>down too much.
>
>(My environment is Red Hat Linux 5.1 with kernel 2.0.34 on a P133 :(
>with 64M RAM and plenty of disk.)
>
>-- 
>Gerald Oskoboiny <gerald@impressive.net>
>http://impressive.net/people/gerald/
>
>----- End forwarded message -----
>
>-- 
>Gerald Oskoboiny <gerald@w3.org>
>http://www.w3.org/Team/Gerald/
>+1 617 905 8011 (mobile)
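
The storage layout asked for in the forwarded message could be sketched
roughly as follows. This is only an illustration, not something from the
thread: Python is an arbitrary choice (the message mentions Perl), and
the names archive_path, store, the ".headers" sidecar suffix and the
"index.html" fallback are all hypothetical. It assumes the proxy hands
over the URL, the fetch date, the response headers and the raw body.

```python
from datetime import date
from pathlib import Path
from urllib.parse import urlsplit


def archive_path(cache_root, url, fetched):
    """Map a URL and fetch date to cache_root/YYYY/MM/DD/scheme/host/path."""
    parts = urlsplit(url)
    # Drop empty, "." and ".." segments so the result stays under cache_root.
    # Query strings are ignored in this sketch.
    segments = [s for s in parts.path.split("/") if s not in ("", ".", "..")]
    if not segments or parts.path.endswith("/"):
        segments.append("index.html")  # hypothetical name for directory URLs
    day = f"{fetched:%Y/%m/%d}"
    return Path(cache_root, day, parts.scheme, parts.netloc, *segments)


def store(cache_root, url, fetched, headers, body):
    """Write the body untouched and the headers to a separate sidecar file."""
    target = archive_path(cache_root, url, fetched)
    target.parent.mkdir(parents=True, exist_ok=True)
    # The archived file stays a valid GIF/HTML file: nothing is prepended.
    target.write_bytes(body)
    # Response headers live next to it, in an assumed "<name>.headers" file.
    sidecar = target.with_name(target.name + ".headers")
    sidecar.write_text("".join(f"{k}: {v}\n" for k, v in headers.items()))
    return target
```

With cache_root set to "/cache", the URL http://www.w3.org/foo.html fetched
on 15 Jan 1999 maps to /cache/1999/01/15/http/www.w3.org/foo.html. Because
the date is part of the path, revisions fetched on different days are kept
side by side; two fetches on the same day would still overwrite each other
in this sketch.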

************************************************************
                     Johan HJELM
            Ericsson RCUR T/K & Cyberlab NY 
         Currently visiting engineer at the W3C
             The World Wide Web Consortium
                     hjelm@w3.org
   http://www.w3.org/People/W3Cpeople.html#Hjelm
    Fax +1-617-258 5999, Phone +1-617-263-9630
   MIT/LCS, 545 Tech. Sq. Cambridge MA 02139 USA 
        opinions are personal, always my own, 
  and not necessarily those of Ericsson or the W3C. 
============================================================