- From: Michael A. Peters <mpeters@mac.com>
- Date: Mon, 15 Jun 2009 23:02:48 -0700
- To: public-html-comments@w3.org
Hi - I hope this is the right list.

I have a suggestion for a new attribute that could potentially make it into the (x)html standard. The attribute is for search engines, to instruct them not to index part of a page.

What I'm currently doing in xhtml is this:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
      "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd" [
      <!ATTLIST div spider (on | off) #IMPLIED>
    ]>

The added attribute is spider and takes a value of on or off. I'm using it in a modification of the open source sphider search engine that I'm working on. The idea is to avoid using html comments to turn indexing on and off for part of a page.

The actual name and values of such an attribute are definitely open to discussion, but I think it should not be specific to any one search crawler.

Example of use -

    <p>This paragraph is indexed</p>
    <p spider="off">This paragraph is not indexed</p>
    <p>This paragraph is indexed</p>
    <div spider="off">
      <p>This paragraph is not indexed</p>
      <p spider="on">This paragraph is indexed</p>
    </div>
    <img src="foo.jpg" alt="[This image is indexed]" />
    <img src="bar.gif" spider="off" alt="[This image is not indexed]" />

Default is on unless the node or a parent node has turned it off.

It would be useful for things like navigation areas, images/multimedia you specifically do not want engines to index, signature areas of bulletin boards, etc.

Of course search engines would need their indexers to respect it, but that is why a standard attribute is very desirable. With a standard, many search engines would implement it because, when properly used by the webmaster, it would improve the usefulness of the search engine.

Thoughts?
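To make the inheritance rule concrete (default is on, a node or parent can turn it off, and a nested spider="on" turns it back on), here is a rough sketch of how an indexer might apply it. This is not the sphider code; the function name indexable_text, the SAMPLE markup, and the choice of Python's standard xml.etree.ElementTree parser are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, adapted from the example in the message.
SAMPLE = """<body>
<p>first indexed</p>
<p spider="off">first hidden</p>
<div spider="off">
  <p>second hidden</p>
  <p spider="on">second indexed</p>
</div>
<img src="foo.jpg" alt="image indexed" />
<img src="bar.gif" spider="off" alt="image hidden" />
</body>"""


def indexable_text(element, inherited=True):
    """Return the text fragments an indexer honouring `spider` would keep."""
    # An explicit spider attribute overrides the inherited state;
    # without one, the element takes its parent's state (default: on).
    value = element.get("spider")
    enabled = inherited if value is None else (value == "on")

    found = []
    if enabled:
        if element.text and element.text.strip():
            found.append(element.text.strip())
        # Assumption: for images, the alt text is what gets indexed.
        if element.tag == "img" and element.get("alt"):
            found.append(element.get("alt"))
    for child in element:
        found.extend(indexable_text(child, enabled))
        # Text after a child belongs to this element, so it follows
        # this element's state rather than the child's.
        if enabled and child.tail and child.tail.strip():
            found.append(child.tail.strip())
    return found


fragments = indexable_text(ET.fromstring(SAMPLE))
print(fragments)
```

The state is passed down the recursion instead of being looked up from ancestors, so each node is visited once and a nested spider="on" naturally overrides an enclosing spider="off".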
Received on Tuesday, 16 June 2009 06:03:29 UTC