- From: Nick Kew <nick@webthing.com>
- Date: Thu, 16 May 2002 21:56:30 +0100 (BST)
- To: David Woolley <david@djwhome.demon.co.uk>
- cc: <w3c-wai-ig@w3.org>
On Thu, 16 May 2002, David Woolley wrote:

> >
> > is there a w3c service that crawls a site and reports errors, in
> > planning perhaps?
>
> That's best done with a local tool.  A W3C service could easily be
> used as a denial of service attack aid.

The key point is that any crawler should operate slowly so as not to
risk overloading a server.  One page per minute is a common rule of
thumb for well-behaved robots (a rough sketch of such a rate-limited
crawl follows below the signature).  This is obviously not compatible
with an online service that spiders while you wait.

> You can also mirror the site using wget, which does respect the "robots"
> protocol, then validate the local copy.

wget runs rapid-fire too; by default it fetches pages as fast as the
server will serve them.

The Site Valet spider does exactly what you're asking for, spidering a
site over time and compiling results which can be emailed to you,
queried online with a browser, or both.

--
Nick Kew

Available for contract work - Programming, Unix, Networking, Markup, etc.
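
As a rough illustration of the slow, robots-aware crawl described above,
here is a minimal Python sketch. It is not Site Valet's implementation;
the start URL, the 60-second delay, and the LinkCollector helper are
placeholders chosen only to show the idea.

import time
import urllib.robotparser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen
from html.parser import HTMLParser

START_URL = "http://example.org/"   # placeholder site to crawl
DELAY = 60                          # one page per minute, per the rule of thumb above

class LinkCollector(HTMLParser):
    """Collect href values from anchor tags on a fetched page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, delay=DELAY):
    # Honour the site's robots.txt before fetching anything.
    robots = urllib.robotparser.RobotFileParser()
    robots.set_url(urljoin(start_url, "/robots.txt"))
    robots.read()

    site = urlparse(start_url).netloc
    queue, seen = [start_url], set()
    while queue:
        url = queue.pop(0)
        if url in seen or not robots.can_fetch("*", url):
            continue
        seen.add(url)

        with urlopen(url) as resp:
            html = resp.read().decode("utf-8", errors="replace")
        # ... hand `html` to a validator or checker here and log the result ...

        # Queue same-site links for later visits.
        collector = LinkCollector()
        collector.feed(html)
        for link in collector.links:
            absolute = urljoin(url, link)
            if urlparse(absolute).netloc == site:
                queue.append(absolute)

        # Wait between requests so the crawl never hammers the server.
        time.sleep(delay)

if __name__ == "__main__":
    crawl(START_URL)

Run over hours rather than minutes, a crawl like this stays well within
the one-page-per-minute guideline, which is exactly why results have to
be compiled over time and reported later rather than "while you wait".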
Received on Thursday, 16 May 2002 16:56:35 UTC