On Sun, 2004-04-04 at 08:46, MichaelJennings wrote:
> http://www.htmlhelp.com/
> HTTP Code returned: 403
> HTTP Message: Forbidden

> Actually, I think if you try the URL you'll find it is
> not only permitted, but pretty good competition.

Really?

david@cyberman david $ pavuk -identity "W3C-checklink/3.9.2 [3.17] libwww-perl/5.64" http://www.htmlhelp.com/
http://www.htmlhelp.com/
URL[ 1]: 1(0) of 1 http://www.htmlhelp.com/
download: ERROR: forbidden HTTP request

Certainly seems to be forbidden to me.

I don't know why htmlhelp.com blocks the link checker, but I wouldn't be surprised if it was something to do with the way it (the link checker) ignores the robots exclusion standard.

david@pils:~$ tail -f /hosts/dorward.me.uk/logs/access.log | grep robot

... nope, doesn't request robots.txt and recursively goes into http://dorward.me.uk/notes/ despite:

User-agent: *
Disallow: /tmp/
Disallow: /images/
Disallow: /notes/
Disallow: /lib/

-- 
David Dorward <http://dorward.me.uk/>

Received on Sunday, 4 April 2004 05:39:23 GMT
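
For illustration only, here is a minimal sketch of the check a crawler that honours the robots exclusion standard would make before descending into /notes/. It uses Python's standard urllib.robotparser module and feeds it the rules quoted in the message. The helper names (build_parser, may_fetch) are hypothetical, and nothing here reflects how the link checker itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the robots.txt quoted in the message above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tmp/
Disallow: /images/
Disallow: /notes/
Disallow: /lib/
"""

# Identity string taken from the pavuk command in the message.
AGENT = "W3C-checklink/3.9.2 [3.17] libwww-perl/5.64"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Hypothetical helper: parse robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Hypothetical helper: True if the parsed rules allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

for url in ("http://dorward.me.uk/", "http://dorward.me.uk/notes/"):
    verdict = "allowed" if may_fetch(parser, AGENT, url) else "disallowed"
    print(url, verdict)
```

A real crawler would fetch /robots.txt from the server first (RobotFileParser.set_url followed by read) instead of using an inline string; the inline copy just keeps the sketch self-contained.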