- From: Peter Kupfer <peter.kupfer@sbcglobal.net>
- Date: Sun, 26 Jun 2005 23:14:46 -0500
- To: Lachlan Hunt <lachlan.hunt@lachy.id.au>
- CC: www-html@w3.org

Lachlan Hunt wrote:
> Peter Kupfer wrote:
>
>> Lachlan Hunt wrote:
>>
>>> The correct way to control the way a spider indexes your site is to
>>> use robots.txt, assuming the spider in question implements it.
>>
>> In a robots.txt file can you control specifically what links a spider
>> will follow on a certain page,
>
> No, it controls which pages on a server the spider can access.
>
>> or just that it won't go to a certain page.
>
> Essentially, yes.

This is what I thought, so, as you concluded, a robots.txt won't fix my
problem here. :(

>> I want the spider to eventually hit each subdomain, just not from the
>> home page, I have it start at each subdomain index?
>
> Then HTML is the wrong place to specify such behaviour and robots.txt is
> probably not suitable for you either. HTML is designed to mark up the
> semantics of the document's content by saying *what* the content is, not
> describe how the content should be processed by a particular UA. Having
> said that though, processing instructions [1] are designed to supply
> system specific information, but I don't know how suitable they would be
> for your particular needs.

Fair enough.

> I don't understand why it matters which path is followed to reach
> subdomains, but I think you need to find a way to configure the robot
> itself, not try to give it instructions from within the documents it reads.

With this service, freefind, it makes a site map, and depending on the
path it takes through the site, varies how the site map is displayed.

>>> nofollow was discussed quite extensively on this list when Google
>>> introduced it and the vast majority of this community rejected it.
>>
>> I tried to search the archive, but didn't see it there, why was no
>> follow rejected?
>
> Then you didn't look very hard. A search for "nofollow" in the archives
> reveals most of the thread, appearing just below the messages from this
> thread. For your convenience, it actually started with a message on
> www-html-editor [2|3], with most of the followup discussion on www-html
> [4].
>
> [1] http://www.is-thought.co.uk/book/sgml-8.htm#PI
> [2] http://lists.w3.org/Archives/Public/www-html-editor/2005JanMar/0010
> [3] http://lists.w3.org/Archives/Public/www-html-editor/2005JanMar/thread#10
> [4] http://lists.w3.org/Archives/Public/www-html/2005Jan/thread#64

Perhaps. I searched for no follow, not in quotes and with a space, and I
got subjects like, "XML tags are just a cheap rip-off of PHP tags" & "DC
in XHTML2", and other things that were not what I wanted. I will go back
and search "nofollow", it didn't occur to me to leave out the space.

Thanks!

-- 
Peter Kupfer
peschtra@yahoo.com

Received on Monday, 27 June 2005 04:14:52 UTC