Re: Question about web spiders... from Lachlan Hunt on 2005-06-26 (www-html@w3.org from June 2005)

From: Lachlan Hunt <lachlan.hunt@lachy.id.au>
Date: Sun, 26 Jun 2005 11:44:31 +1000
To: Jasper Bryant-Greene <jasper@bryant-greene.name>
CC: Peter Kupfer <peter.kupfer@sbcglobal.net>, www-html@w3.org
Message-ID: <42BE087F.6070701@lachy.id.au>

Jasper Bryant-Greene wrote:
> Peter Kupfer wrote:
> 
>>[snip]
>>Questions:
>>
>>1) Is there another way to accomplish what I am trying to accomplish?
>>2) Does the W3C plan to implement the <nofollow> tag or anything like it
>>in the near future?
>>
>>I want to be standards compliant, but I also want to be able to tell a
>>spider where it can and can not go.
> 
> I'm not sure about your specific spider, but the commonly accepted way
> to do what you describe is something like:
> 
> <a href="http://www.example.org/" rel="nofollow">Link</a>

That does not do what its name suggests; the spider is still free to 
follow the link.  It was designed to indicate that the link should not 
be counted in the PageRank algorithm.
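
To make the distinction concrete, here is a minimal sketch of how a 
spider might treat rel="nofollow" under that reading: every link is 
still collected for crawling, and the rel value only decides whether 
the link counts as a ranking vote.  The LinkCollector class and the 
second example URL are my own illustrative assumptions, not how any 
real search engine is implemented.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: records (href, counts_for_ranking) per <a>."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rel is a space-separated token list; compare case-insensitively.
        rel_tokens = (attr.get("rel") or "").lower().split()
        # The link is kept either way; nofollow only clears the ranking flag.
        self.links.append((href, "nofollow" not in rel_tokens))


collector = LinkCollector()
collector.feed(
    '<a href="http://www.example.org/" rel="nofollow">Link</a>'
    '<a href="http://www.example.org/about">About</a>'
)
for href, counts in collector.links:
    print(href, "counts for ranking" if counts else "crawlable, not counted")
```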

The correct way to control which parts of your site a spider visits is 
to use robots.txt, assuming the spider in question implements it.
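
As a rough sketch of what "implements it" means, a well-behaved spider 
checks each URL against the site's robots.txt before fetching it.  The 
example below uses Python's standard urllib.robotparser module; the 
rules in ROBOTS_TXT, the "ExampleBot" user agent and the allowed() 
helper are hypothetical, chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real spider would fetch this from
# http://www.example.org/robots.txt instead of hard-coding it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed("ExampleBot", "http://www.example.org/private/page.html"))
print(allowed("ExampleBot", "http://www.example.org/index.html"))
```

With these rules the first URL is refused and the second is permitted.  
Nothing enforces this on the server side; it only works if the spider 
chooses to run such a check.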

> That's perfectly standards compliant, and Googlebot obeys that, as well
> as several other major spiders AFAIK.

It is not standards compliant at all.  It's a proprietary extension that 
just happens to pass DTD-based validation.  nofollow was discussed quite 
extensively on this list when Google introduced it, and the vast majority 
of this community rejected it.

-- 
Lachlan Hunt
http://lachy.id.au/

Received on Sunday, 26 June 2005 01:44:36 UTC