- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Wed, 14 Dec 2011 16:52:02 -0500
- To: public-webapps@w3.org
On 12/14/11 3:15 AM, Boris Zbarsky wrote:
> Yeah, understood. Working on getting that description.

Ok. It's just a simple spider that starts with the list at http://code.google.com/p/httparchive/source/browse/trunk/lists/All.txt and, for each of those URLs, loads the URL itself and then follows all same-host links from that page. So it loads the front page of each site and all the same-host pages one level deep.

-Boris
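
For illustration only, the crawl described above could be sketched roughly as follows. This is not the actual spider; the names `LinkCollector`, `same_host_links`, `crawl_one_level`, and the injected `fetch` callable are hypothetical, and "same-host" is assumed here to mean an exact, case-insensitive match on the URL's host:port component.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit, urldefrag


class LinkCollector(HTMLParser):
    """Collects raw href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_host_links(page_url, html):
    """Return absolute http(s) links in html whose host matches page_url's.

    Assumption: "same host" means the netloc parts compare equal,
    ignoring case. Fragments are dropped and duplicates are skipped.
    """
    collector = LinkCollector()
    collector.feed(html)
    host = urlsplit(page_url).netloc.lower()
    seen, links = set(), []
    for href in collector.hrefs:
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlsplit(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, etc.
        if parts.netloc.lower() != host:
            continue  # different host: not followed
        if absolute == page_url or absolute in seen:
            continue  # already loaded or already queued
        seen.add(absolute)
        links.append(absolute)
    return links


def crawl_one_level(start_urls, fetch):
    """Load each start URL, then every same-host link found on that page.

    `fetch` is any callable mapping a URL to its HTML text. Links found
    on the second-level pages are not followed.
    """
    loaded = []
    for start in start_urls:
        html = fetch(start)
        loaded.append(start)
        for link in same_host_links(start, html):
            fetch(link)
            loaded.append(link)
    return loaded
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP client; the start URLs would come from the All.txt list mentioned above.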
Received on Wednesday, 14 December 2011 21:52:40 UTC