RE: Search engine and alt text: [techs] Latest HTML Techniques Draft

From: Tom Croucher <tcroucher@netalleynetworks.com>
Date: Wed, 3 Dec 2003 11:35:49 +0000
To: Jens Meiert <jens.meiert@erde3.com>
Message-Id: <D7F58AF6-2584-11D8-862F-003065C9C000@netalleynetworks.com>
Cc: w3c-wai-gl@w3.org


 >>> Question: Do search engines really scan alt text given for images?

 ><snip />

 >> one of my co-workers was architect at alta-vista so I asked him; seems
 >> that 'most' search engines should be expected to parse text that is in
 >> elements and can be expected to parse certain attribute values, like
 >> alt, as well.

 >I also remember cases (when searching via Google) where 'alt' attribute
 >values were shown, but unfortunately I was neither able to reproduce this
 >nor have I found documentation about this right now.

Search engines and indexing are something I spent some time studying,
and as far as I am aware search engines no longer use alt text for
indexing in practice. There are a few points related to this. The main
reason not to use it in indexing is to avoid 'non-visual stuffing', a
practice where a page is filled with keywords in an attempt to achieve
a high ranking score on search engines. So as not to put users off,
this text was often hidden in the alt text of images or set in the same
colour as the background. To avoid genuine results being displaced by
these techniques, engines such as Google largely ignore alt text, and
even metadata in <meta> tags.

The other issue is that alt text is semantically dead. A 'real' text
alternative, as provided by the object tag, can contain markup such as
<em>, <hX> and so on, whereas alt attribute text can at most take on
one of these roles if the img tag is wrapped in the corresponding
element (e.g. <h1><img src="logo.jpg" alt="My company" /></h1>). I
might add that most of the internal workings of each search engine are
confidential, so this is to some extent speculation.
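
To make the 'semantically dead' point concrete, here is a rough sketch
in Python of what a parser can recover from the two kinds of text
alternative. It is only an illustration and not how any particular
search engine works. The class name TextWithContext and the helper
extract() are made up for this example, and treating alt values as
indexable text is an assumption made purely for the comparison.

```python
from html.parser import HTMLParser


class TextWithContext(HTMLParser):
    """Collect (enclosing tags, text) pairs; alt values count as text.

    Hypothetical helper for illustration only.
    """

    # Elements that never have a closing tag, so they are not pushed.
    VOID = {"img", "br", "hr", "meta", "input", "link"}

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open elements, outermost first
        self.found = []  # (tuple of enclosing tags, text)

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                # alt text only inherits whatever wraps the img element
                self.found.append((tuple(self.stack), alt))
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop back to (and including) the matching open element.
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.found.append((tuple(self.stack), text))


def extract(html):
    """Return the (context, text) pairs found in an HTML fragment."""
    parser = TextWithContext()
    parser.feed(html)
    parser.close()
    return parser.found


# alt text: one flat string, marked up only by the wrapping <h1>
print(extract('<h1><img src="logo.jpg" alt="My company" /></h1>'))

# object fallback: each run of text keeps its own markup
print(extract('<object data="logo.jpg"><h1>My <em>company</em></h1></object>'))
```

In the first case the whole alt string can only ever be reported as
sitting inside the h1. In the second, 'company' is additionally seen
inside em, which is structure an alt attribute has no way to express.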

Tom
Received on Wednesday, 3 December 2003 06:35:58 GMT
