On Fri, 2003-01-03 at 18:18, Dan Brickley wrote:
>
> * Edd Dumbill <edd@usefulinc.com> [2003-01-03 20:35+0000]
> > I'm fine with a crawler, you're right about the bandwidth requirements
> > -- especially as each day there are at most 4 files which change in the
> > chump hierarchy (front page, archived day, month, year.)
> >
> > A simple Perl script could mirror the chump stuff quite happily.
>
> Yep, Gerald's right. I guess I figured the laziest, simplest thing
> was just grabbing the single tar.gz, but probably best to get the 4 pages
> instead.

Just in case, I thought I would toss this out there... Is the data for the 4 pages available as RDF? If so, I would be willing to contribute a script that would crawl the pages and keep an InformationStore (or TripleStore) up to date so that the RDF is archived. I could also contribute code for generating an HTML view of the RDF, etc.

--
Daniel Krech, http://eikeon.com/
Redfoot.net, http://redfoot.net/
RDFLib.net, http://rdflib.net/

Received on Friday, 3 January 2003 18:36:59 GMT