- From: Justin Chapweske <justin@chapweske.com>
- Date: Mon, 31 Mar 2003 12:52:05 -0600
- To: Andre John Mas <ajmas@newtradetech.com>
- Cc: www-talk@w3.org
The only thing that I don't like about this is that normal HTTP mirroring
is very insecure. Our work with the "Content-Addressable Web" uses secure
checksums and some HTTP extensions to provide an alternate way of solving
the mirror problem.

Our paper on "HTTP Extensions for a Content-Addressable Web" can be found
at (http://open-content.net/specs/draft-jchapweske-caw-03.html). You may
also be interested in its companion specification, "The Tree Hash EXchange
format (THEX)" at
(http://open-content.net/specs/draft-jchapweske-thex-02.html).

We also have a very basic XML-RPC protocol for lease-based mirror
advertisement at (http://open-content.net/specs/).

Also, there is a functioning Content-Addressable Web header proxy that you
can feel free to play around with. It's currently used for the Open
Content Network and can be used as follows:

bash$ HEAD http://gw1.open-content.net:8080/gateway/head?uri=http://etree01.archive.org/etree/moe1997-03-28dnk.shnf/moe1997-03-28d1/moe1997-03-28d1t06.shn
200 OK
Date: Sat, 18 Jan 2003 00:15:05 GMT
Accept-Ranges: bytes
Server: TornadoGateway/1.0 (http://onionnetworks.com/; i386; Linux)
Content-Length: 78691253
Content-MD5: 84lI1a9IFPJq7jb3YG3m9Q==
Content-Type: audio/shn
ETag: "3360009-4b0bbb5-3d8b447e"
Last-Modified: Fri, 20 Sep 2002 15:53:34 GMT
Client-Date: Mon, 31 Mar 2003 18:50:22 GMT
Client-Peer: 209.237.232.89:8080
X-Content-URN: urn:md5:6OEURVNPJAKPE2XOG33WA3PG6U
X-Content-URN: urn:sha1:VTHQINIP3JUPJIMMC5RLVZSEFKMQ5KLX
X-Content-URN: urn:tree:tiger:S6SMQPZXUD7G54ZPIJMXJPN7JAABQXM2ZCKIUEQ
X-First-Bytes: 616a6b6702fbb17009f9255952a4d1a8dc48766a1157a0d5a8b66b6dd241108040201018040a0144d64020110d8c0a0104804420164b0dd2c3a08766a11ec0070000000b8000622efb1fb66659b36d85b45d6d77d3f081756652a563b41c94dbc24ce97b31fd3e4094415a862558d6756102e987170e9c591f2a5428dcc84bc43b21554c1444fe9a306fa2e9450125e78931c15f346cc6597762d6557c68623bc99254bdeaaf470888a9e104d631cca938cf0132314e7547b94069c86106060ea012e8d9c4c3211e99b4d3618070e33359a76670f85cc449e08468ec15ecf4e64e03d3dfb976c324444a9cf31ec599682060769e4e23bf9fce1ad3ffef94be4b
X-Observed-IP: 24.118.168.169
X-Thex-URI: http://gw1.open-content.net:8080/gateway/thex?uri=http://etree01.archive.org/etree/moe1997-03-28dnk.shnf/moe1997-03-28d1/moe1997-03-28d1t06.shn;S6SMQPZXUD7G54ZPIJMXJPN7JAABQXM2ZCKIUEQ

Andre John Mas wrote:
>
> Hi,
>
> Mirroring a web site or ftp site is a great way of reducing load
> and improving access times. The only thing though is that there is
> no method for telling a web browser to automatically go to a mirror.
> For this reason I have been thinking that a 'mirrors.txt' file might
> be of use at the root of a web site that is either the master or a
> mirror, in the same way that a robots.txt file is made available.
>
> Follows is an example of what the contents of such a file would contain:
>
> ----start of example
> #this is a comment
>
> title: Project Gutenberg
> description: Project Gutenberg is the Internet's oldest producer of FREE
> electronic books (eBooks or eTexts).
> master: http://gutenberg.net/
> search: master
>
> mirror.name: University of North Carolina - HTTP
> mirror.city: Chapel Hill
> mirror.state: North Carolina
> mirror.country: USA
> mirror.gridref:
> mirror.url: http://www.ibiblio.org/gutenberg/
> mirror.update.freq: daily
> mirror.comment: Main Project Gutenberg Collection Site
>
> mirror.name: University of North Carolina - FTP
> mirror.city: Chapel Hill
> mirror.state: North Carolina
> mirror.country: USA
> mirror.gridref: 0/+1000,-1000
> mirror.url: ftp://ibiblio.org/pub/docs/books/gutenberg/
> mirror.update.freq: daily
> mirror.comment: Main Project Gutenberg FTP Site -- If it doesn't allow
> access, please try the corresponding HTTP site above
>
> ----end of example
>
> Most of the fields should be self explaining, though for the less
> obvious:
> - search: values would be mirror or master. This is important if
>   only the master offers a search facility
> - mirror.gridref: the grid coordinates of the mirror. The slash
>   is there for a future use, such as defining planet ID as prefix.
>   The grid ref would always be the last child. I know this is
>   overkill, and probably no one will take this seriously, but I
>   would like to make this future proof, if there is no extra cost.
> - mirror.update.freq: how often the mirror is updated (should this
>   be a numerical, textual value or both?)
>
> Some sites mirror several others, so the site would probably need more
> than one mirror file. Two suggestions are to have the additional mirror
> files have a numeric suffix, e.g. mirrors.txt, mirrors2.txt, etc. or
> to have a mirrors.txt file that refers to the other mirrors.txt files.
>
> Also, search engines, such as Google, could make use of this information
> to tie together mirrors under one link, to make for smarter navigation.
> Something such as:
>
> PROJECT GUTENBERG -
> Project Gutenberg is the Internet's oldest producer of FREE
> electronic books (eBooks or eTexts).
> gutenberg.org/ - 18k - Master - Closest Mirror - Other Mirrors
>
> This is a first jab at something that could well be of use, so I would
> certainly appreciate your comments and whether this is something that
> could be added as a web standard?
>
> regards
>
> Andre
>
> P.S. I am not associated with Project Gutenberg, I am just using it as
> a useful example of a real site that could benefit from such a solution.
>
>

--
Justin Chapweske, Onion Networks
http://onionnetworks.com/
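
The X-Content-URN and Content-MD5 headers in the HEAD output in this message suggest how a client might check a copy fetched from an untrusted mirror. The following is only a rough sketch, not the gateway's or the specification's actual procedure. It assumes, based on the header values shown, that urn:md5 and urn:sha1 carry the unpadded Base32 form of the raw digest. The function names are hypothetical, and the urn:tree:tiger hash is skipped because Python's hashlib has no Tiger implementation.

```python
import base64
import hashlib


def _b32(digest: bytes) -> str:
    """Unpadded Base32 text of a raw digest (assumed URN encoding)."""
    return base64.b32encode(digest).decode("ascii").rstrip("=")


def md5_urn(data: bytes) -> str:
    return "urn:md5:" + _b32(hashlib.md5(data).digest())


def sha1_urn(data: bytes) -> str:
    return "urn:sha1:" + _b32(hashlib.sha1(data).digest())


def content_md5_to_urn(header_value: str) -> str:
    """Re-encode a Base64 Content-MD5 header value as a urn:md5 string."""
    return "urn:md5:" + _b32(base64.b64decode(header_value))


def verify(data: bytes, advertised: list[str]) -> bool:
    """True if every advertised md5/sha1 URN matches the downloaded bytes.

    URNs with schemes this sketch cannot compute (e.g. urn:tree:tiger)
    are ignored; at least one checkable URN must be present.
    """
    computed = {md5_urn(data), sha1_urn(data)}
    checkable = [u for u in advertised
                 if u.startswith(("urn:md5:", "urn:sha1:"))]
    return bool(checkable) and all(u in computed for u in checkable)
```

As a consistency check on the encoding assumption, content_md5_to_urn("84lI1a9IFPJq7jb3YG3m9Q==") yields the same urn:md5 value that the gateway reports. MD5 is included only because the gateway advertises it; a client that cares about tampering would presumably rely on the stronger hashes.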
Received on Monday, 31 March 2003 13:56:40 UTC
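
For readers who want to experiment with the mirrors.txt idea quoted in the message, here is a minimal parsing sketch. The proposal does not define a grammar, so everything here is an assumption drawn from the single example: lines starting with "#" are comments, "key: value" lines set fields, a line that does not start with a key continues the previous value, and each "mirror.name" starts a new mirror record. The function name parse_mirrors and the returned dictionary layout are hypothetical.

```python
import re

# A key is lowercase words joined by dots, immediately followed by a colon,
# e.g. "title:" or "mirror.update.freq:". URLs inside values do not match
# because the match is anchored at the start of the line.
KEY_LINE = re.compile(r"^([a-z]+(?:\.[a-z]+)*):\s*(.*)$")


def parse_mirrors(text: str) -> dict:
    """Parse mirrors.txt-style text into site fields plus a 'mirrors' list."""
    site: dict = {}
    mirrors: list[dict] = []
    last = None  # (record, key) of the most recent field, for continuations

    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            last = None  # blank lines and comments end a continuation
            continue

        match = KEY_LINE.match(line)
        if match is None:
            # Continuation of the previous value (e.g. a wrapped description).
            if last is not None:
                record, key = last
                record[key] = (record[key] + " " + line).strip()
            continue

        key, value = match.groups()
        if key.startswith("mirror."):
            # "mirror.name" opens a new mirror record.
            if key == "mirror.name" or not mirrors:
                mirrors.append({})
            record = mirrors[-1]
            key = key[len("mirror."):]
        else:
            record = site
        record[key] = value
        last = (record, key)

    site["mirrors"] = mirrors
    return site
```

Feeding it the example between the "----start of example" and "----end of example" markers (without the quote prefixes) should give a dictionary with the site-level fields and two mirror records, with keys such as "url" and "update.freq".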