- From: Gerald Oskoboiny <gerald@w3.org>
- Date: Fri, 10 Apr 1998 17:22:14 -0400 (EDT)
- To: www-html-editor@w3.org
Hi, at
http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.4.1.1
it says:
> Some tips: URI's are case-sensitive, and "/robots.txt" string must be
> all lower-case. Blank lines are not permitted.
This last bit ("Blank lines are not permitted.") is incorrect, or
at least quite misleading the way it is currently written.
Blank lines *are* permitted in the robots.txt file, just not within
a single "record" (though "record" doesn't seem to be defined
anywhere here). Blank lines are what separate one record from the next.
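To illustrate the distinction, here is a rough sketch using Python's
standard urllib.robotparser module. The robot names ("examplebot",
"otherbot") and the paths are made up for illustration, and the results
show how this one parser treats blank lines, not what any spec
guarantees. The first file uses a blank line between two records; the
second puts a blank line inside a record.

```python
from urllib.robotparser import RobotFileParser

# Two records separated by a blank line (hypothetical robot names and paths).
WELL_FORMED = """\
# Record 1: one specific robot
User-agent: examplebot
Disallow: /private/

# Record 2: every other robot
User-agent: *
Disallow: /tmp/
"""

# A blank line *inside* a record: the Disallow line is cut off from
# its User-agent line.
BROKEN = """\
User-agent: examplebot

Disallow: /private/
"""


def load(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


good = load(WELL_FORMED)
print(good.can_fetch("examplebot", "/private/page.html"))  # False
print(good.can_fetch("examplebot", "/tmp/page.html"))      # True
print(good.can_fetch("otherbot", "/tmp/page.html"))        # False

bad = load(BROKEN)
# This parser ends the record at the blank line, so the rule is dropped.
print(bad.can_fetch("examplebot", "/private/page.html"))   # True
```

With this parser, the blank line between records is harmless, while
the blank line inside the record silently discards the Disallow rule.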
I still think it might be a good idea to cite some other source,
like one of:
http://www.kollar.com/robots.html
http://info.webcrawler.com/mak/projects/robots/norobots-rfc.html
http://info.webcrawler.com/mak/projects/robots/robots.html
I also think we should resist the urge to include stuff like this in
future specs; this section really doesn't seem to belong in an HTML
spec at all! I understand it was probably put there because there
aren't any other easily citable sources, but in that case I think
we should quickly publish whatever material we want to reference
as a NOTE and reference that, because at least that way it can be
updated more easily if there are problems.
Later in that same section, it says:
> Robots and the META element
>
> The META element allows HTML authors to tell visiting robots whether a
> document may be indexed, or used to harvest more links. No server
> administrator action is required.
>
> In the following example a robot should neither index this document,
> nor analyze it for links.
>
> <META name="ROBOTS" content="NOINDEX, NOFOLLOW">
>
> The list of terms in the content is ALL, INDEX, NOFOLLOW, NOINDEX.
> The name and the content attribute values are case-insensitive.
Where are these terms defined?
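For what it's worth, here is a minimal sketch of how a robot might read
that element, using Python's standard html.parser module. Because the
section doesn't define the terms, the interpretation here is an
assumption: NOINDEX is taken to mean "don't index", NOFOLLOW "don't
follow links", and anything else leaves the default (allowed). The
class and function names are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the terms from <META name="ROBOTS" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us;
        # attribute *values* keep their case, so lowercase them here.
        if tag != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("name") or "").lower() != "robots":
            return
        for term in (attrs.get("content") or "").split(","):
            term = term.strip().lower()
            if term:
                self.directives.add(term)


def robots_directives(html):
    """Return the set of lowercased robots terms found in the document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


found = robots_directives('<META name="ROBOTS" content="NOINDEX, NOFOLLOW">')
may_index = "noindex" not in found    # assumed meaning of NOINDEX
may_follow = "nofollow" not in found  # assumed meaning of NOFOLLOW
print(sorted(found), may_index, may_follow)
```

Lowercasing both the name and the content values follows the spec's
statement that they are case-insensitive.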
Thanks!
Gerald
--
Gerald Oskoboiny <gerald@w3.org> +1 617 253 2920
System Administrator, W3C http://www.w3.org/People/Gerald/
World Wide Web Consortium, MIT Laboratory for Computer Science
545 Technology Square, Room NE43-353 Cambridge MA 02139 USA
Received on Friday, 10 April 1998 17:22:16 UTC