- From: Gerald Oskoboiny <gerald@w3.org>
- Date: Fri, 10 Apr 1998 17:22:14 -0400 (EDT)
- To: www-html-editor@w3.org
Hi,

at http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.4.1.1
it says:

> Some tips: URI's are case-sensitive, and "/robots.txt" string must be
> all lower-case. Blank lines are not permitted.

This last bit ("Blank lines are not permitted.") is incorrect, or at
least quite misleading the way it is currently written. Blank lines
*are* permitted in the robots.txt file, just not within a single
"record". (though "record" doesn't seem to be defined anywhere here.)

I still think it might be a good idea to cite some other source,
like one of:

    http://www.kollar.com/robots.html
    http://info.webcrawler.com/mak/projects/robots/norobots-rfc.html
    http://info.webcrawler.com/mak/projects/robots/robots.html

I also think we should resist the urge to include stuff like this in
future specs; this section really doesn't seem to belong in an HTML
spec at all! I understand it was probably put there because there
aren't any other easily citable sources, but in that case I think we
should quickly publish whatever material we want to reference as a
NOTE and reference that, because at least that way it can be updated
more easily if there are problems.

Later in that same section, it says:

> Robots and the META element
>
> The META element allows HTML authors to tell visiting robots whether a
> document may be indexed, or used to harvest more links. No server
> administrator action is required.
>
> In the following example a robot should neither index this document,
> nor analyze it for links.
>
> <META name="ROBOTS" content="NOINDEX, NOFOLLOW">
>
> The list of terms in the content is ALL, INDEX, NOFOLLOW, NOINDEX.
> The name and the content attribute values are case-insensitive.

Where are these terms defined?

Thanks!

Gerald

-- 
Gerald Oskoboiny <gerald@w3.org>  +1 617 253 2920
System Administrator, W3C  http://www.w3.org/People/Gerald/
World Wide Web Consortium, MIT Laboratory for Computer Science
545 Technology Square, Room NE43-353  Cambridge MA 02139 USA

Received on Friday, 10 April 1998 17:22:16 UTC