All of the WWW Available **Forever**
Edward Cherlin (cherlin@cauce.org)
Mon, 19 May 1997 10:20:43 -0700
Message-Id: <v03007806afa63bb6549b@[206.245.192.36]>
Date: Mon, 19 May 1997 10:20:43 -0700
To: uri@bunyip.com
From: Edward Cherlin <cherlin@cauce.org>
Subject: All of the WWW Available **Forever**
This suggests a new URL scheme: a traditional URL plus a date, directed to this
archive. Something similar could work for Usenet, directed to Deja News.
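
A minimal sketch of what such a dated lookup could look like, in Python. The
resolver address archive.example, the path layout, and the function names
make_dated_url and parse_dated_url are all invented for illustration; nothing
here describes an interface the Archive actually offers.

```python
from datetime import date
from urllib.parse import quote, unquote

# Hypothetical archive resolver; not a real service address.
ARCHIVE_BASE = "http://archive.example/lookup"


def make_dated_url(url: str, when: date) -> str:
    """Combine an ordinary URL and a date into one archive lookup URL."""
    # Percent-encode the whole original URL so it survives as one path segment.
    return f"{ARCHIVE_BASE}/{when.strftime('%Y%m%d')}/{quote(url, safe='')}"


def parse_dated_url(dated: str) -> tuple[str, date]:
    """Split a lookup URL back into (original URL, requested date)."""
    prefix = ARCHIVE_BASE + "/"
    if not dated.startswith(prefix):
        raise ValueError("not a dated archive URL")
    stamp, _, encoded = dated[len(prefix):].partition("/")
    if len(stamp) != 8 or not stamp.isdigit():
        raise ValueError("date must be written as YYYYMMDD")
    when = date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8]))
    return unquote(encoded), when
```

For example, make_dated_url("http://webreview.com/97/05/09/edge/index2.html",
date(1997, 5, 19)) yields a single string that a resolver could split back
into the original URL and the requested date.
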
>Subject: All of the WWW Available **Forever**
>To: xanadu@xanadu.com.au
>From: ____Textpert Alert____ <ianf@random.se>
>Mime-Version: 1.0
>Reply-To: xanadu@glasswings.com.au
>Precedence: list
>Date: Mon, 19 May 1997 14:41:14 +0200
>
> True to my name handle, I'd like to alert y'all to the truly
> Xanadudlian mission of the start-up Internet Archive and Alexa
> companies, the former a non-profit effort to continuously
>
> s t o r e ALL OF (unrestricted-access) WWW pages FOREVER ;
>
> the second a commercial outfit developing tools to browse and
> reuse such cumulative/ multi-generation archive contents.
>
> Acc. to their owner Brewster Kahle --formerly of the Thinking
> Machines Corp., and a father of WAIS-- one of the target functions
> of Alexa-derived software is to be a `"reliability service" that
> will resurrect dead links. Give the URL and an approximate date
> to the Archive, and it will dig up the document.'..... rings a
> bell, doesn't it?
>
> The Alexa archives are made of successive sweep-n-suck (BIIIG
> sucks, too) sessions of the entire WWW dataspace resulting in
> consecutive "frozen Webs" stored at one location -- currently
> a warehouse in SF; ultimately in the digital storage facility of
> the US National Archives in Washington, D.C. Treating an entire
> docuverse as a collection of "barts" (or "stamps", I keep mixing
> them up) may sound like a bit of overkill, but whoever said that
> the (yellow brick) road to Xanadu must be straight and narrow?
>
>
>__Ian
>
>
>Based on Paul Bissex' article at:
>______________________________________________
>http://webreview.com/97/05/09/edge/index2.html
>
>> [...] whereas keyword search engines [AltaVista etc]
>> store an index to the Web, the Archive consists of a
>> copy of the Web itself. Kahle estimates the current
>> size of the Web at about two terabytes (that's two
>> million megabytes). Having completed two full sweeps
>> of the Web, the Archive now contains about four
>> terabytes of data. A recent upgrade of the Archive's
>> connection from two T1 lines to a full T3 brings
>> a welcome 15-fold increase in bandwidth, meaning
>> that future Web "snapshots" will be conducted much
>> faster than the first two. With some researchers
>> estimating the average life of a Web page at 75 days,
>> speed matters.
>
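
As a rough check on the figures quoted above, here is a back-of-envelope
calculation in Python. It assumes the standard line rates of 1.544 Mbit/s for
a T1 and 44.736 Mbit/s for a T3, takes a terabyte as 10^12 bytes, and ignores
protocol overhead, server delays, and crawl politeness, so the transfer times
are lower bounds rather than estimates of real sweep times.

```python
# Assumed nominal line rates in bits per second (not stated in the article).
T1_BPS = 1.544e6
T3_BPS = 44.736e6

# Web size quoted in the article: about two terabytes (10**12 bytes each, assumed).
WEB_BYTES = 2e12
SECONDS_PER_DAY = 86_400


def transfer_days(num_bytes: float, bits_per_second: float) -> float:
    """Idealized time to move num_bytes over a link, in days."""
    return num_bytes * 8 / bits_per_second / SECONDS_PER_DAY


# Compare the old connection (two T1 lines) with the new one (one T3).
speedup = T3_BPS / (2 * T1_BPS)
days_two_t1 = transfer_days(WEB_BYTES, 2 * T1_BPS)
days_t3 = transfer_days(WEB_BYTES, T3_BPS)

print(f"speedup: {speedup:.1f}x")
print(f"two T1 lines: {days_two_t1:.0f} days")
print(f"one T3 line: {days_t3:.1f} days")
```

Under these assumptions the ratio comes out near 14.5, in line with the quoted
15-fold figure, and a two-terabyte sweep drops from roughly 60 days of raw
transfer to a little over 4, which matters when pages are said to last about
75 days.
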
--
Edward Cherlin          Help outlaw Spam           Everything should be made
Vice President          http://www.cauce.org       as simple as possible,
NewbieNet, Inc.         1000 members and counting  __but no simpler__.
http://www.newbie.net/  17 May 97                  Attributed to Albert Einstein