Re: HTML and Search Engines

Joe English (jenglish@crl.com)
Thu, 24 Aug 1995 13:19:43 -0700


Message-Id: <199508242020.AA22435@mail.crl.com>
To: www-html@w3.org
Subject: Re: HTML and Search Engines 
In-Reply-To: <Pine.A32.3.91.950824132352.127799C-100000@vth1.vth.colostate.edu> 
Date: Thu, 24 Aug 1995 13:19:43 -0700
From: Joe English <jenglish@crl.com>


Jay Kammerzell <jkammerz@vth1.vth.colostate.edu> wrote:

> Is there a way that standard document data could be included 
> in a referenced document for WWW search engines to glean rather than 
> taking the first few lines of that document?
> 
> It would seem that having some set of attributes that didn't display 
> but was available to a search engine could improve database searches and 
> make the returned information more meaningful. 

The canonical way to do this is to put:


	<META NAME="KEYWORDS" CONTENT="text to index here...">

in the document <HEAD>.  You can use as many <META NAME=KEYWORDS> elements
as you like.  Browsers won't display the text.

I don't know which, if any, current search engines actually
look in <META> elements.  Sophisticated engines that understand
HTML ought to check for META elements, and naive ones that just
index all text will pick up the keywords too.  
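
As a rough illustration of what an HTML-aware indexer might do, here is a
minimal sketch using the Python standard library's html.parser module. It is
not how any particular search engine works. The class name, the function
name, and the sample page are all hypothetical, and it assumes the keywords
are carried in the NAME and CONTENT attributes as shown above.

```python
from html.parser import HTMLParser


class MetaKeywordCollector(HTMLParser):
    """Collects the CONTENT of every <META NAME="KEYWORDS"> element."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.lower() == "keywords":
            content = attributes.get("content")
            if content:
                self.keywords.append(content)


def extract_keywords(html_text):
    """Return the keyword strings found in META elements, in document order."""
    collector = MetaKeywordCollector()
    collector.feed(html_text)
    collector.close()
    return collector.keywords


# Hypothetical sample page with two KEYWORDS elements (one unquoted NAME).
SAMPLE = """<HTML><HEAD>
<TITLE>Sample page</TITLE>
<META NAME="KEYWORDS" CONTENT="first batch of index terms">
<META NAME=KEYWORDS CONTENT="second batch">
</HEAD><BODY><P>Visible text.</P></BODY></HTML>"""

if __name__ == "__main__":
    print(extract_keywords(SAMPLE))
```

Because the lookup compares the NAME value case-insensitively, both
NAME="KEYWORDS" and NAME=keywords would be picked up. Multiple elements
simply yield multiple entries.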

Moderately smart search engines that understand basic SGML syntax 
but not HTML semantics will most likely ignore keywords entered
in this way, since they're part of the markup.  Such engines are
also likely to ignore <!-- stuff in comments -->, so there's
no way that I can think of to make text invisible to browsers but
visible to these.
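
The difference between the naive and the "moderately smart" cases can be
sketched as follows. This is only an assumption about how such indexers
might behave, not a description of a real engine. The function names and the
sample page are hypothetical. The markup-aware version keeps character data
only, so both the META keywords and the comment text drop out of the index.

```python
import re
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Keeps character data only; tags, attributes and comments are skipped."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def words(text):
    """Split text into a set of lowercase alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def naive_index(raw):
    # Indexes every word in the file, markup and attribute values included.
    return words(raw)


def markup_aware_index(raw):
    # Parses the markup and indexes only the text between tags.
    parser = TextOnly()
    parser.feed(raw)
    parser.close()
    return words(" ".join(parser.chunks))


# Hypothetical sample page: one META keyword, one comment, one visible word.
PAGE = """<HEAD><META NAME="KEYWORDS" CONTENT="hidden"></HEAD>
<BODY><!-- secret --><P>Visible</P></BODY>"""

if __name__ == "__main__":
    print(sorted(naive_index(PAGE)))
    print(sorted(markup_aware_index(PAGE)))
```

The naive index picks up "hidden" and "secret" (along with tag names such
as "meta" and "body"), while the markup-aware index ends up with "visible"
alone.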

Again, I have no idea how any of the common search engines actually
work...


--Joe English

  jenglish@crl.com