- From: Philip Taylor <pjt47@cam.ac.uk>
- Date: Thu, 17 Apr 2008 00:44:58 +0100
- To: "Bonner, Matt (IPG)" <matt.bonner@hp.com>
- CC: "public-html@w3.org" <public-html@w3.org>, "wai-xtech@w3.org" <wai-xtech@w3.org>, "wai-liaison@w3.org" <wai-liaison@w3.org>, Karl Dubost <karl@w3.org>
Bonner, Matt (IPG) wrote:
> MB> It seems like gathering data from various sources would advance this
> MB> debate more usefully than any amount of speculation on what might be.
>
> IH> What data would you like me to collect?
>
> Well, the data from a web crawl that seem germane would be
> along the lines of percentages of images for the oft-mentioned
> three cases:
>
> . have no alt attribute
> . have an alt=""
> . have an alt="(a descriptive string)"
>
> Obviously that still gives you no sense how often the alt
> text is useful, but it's a start. [...]

Unlike Google I have no money so there's no danger of me wasting any on a
survey, but I've already got a sample of about 130,000 pages from the list
on dmoz.org, so I looked at that for free.

<img> elements with no alt attribute:               1104466 (47%)
<img> elements with zero-length alt:                 530687 (23%)
<img> elements with non-empty whitespace-only alt:    11943 (1%)
<img> elements with non-empty non-whitespace alt:    702702 (30%)

> On 17 Apr 2008, at 06:59, Karl Dubost wrote:
>> More challenging, distributions of "text", collect all the text
>> contained in alts, sort them out, and then sees what are the text
>> which are happening very often (I think about things like "logo"
>> emerging, but there might be surprises).

http://philip.html5.org/data/common-alt-values.txt

(It's hard to tell much from that, since a single site with hundreds of
pages listed on dmoz.org will significantly distort the results.)

> additional one:
> Distribution of text lengths

http://philip.html5.org/data/alt-lengths.svg

(The longest were about 10,000 characters -
http://www.coalitionforjustice.net (looks like actually legitimate
alternative text) and http://www.legnotre.com/ (looks like search engine
keyword spam) - but I cut the graph off much earlier, since very few are
longer than ~200 characters and it makes the graph more boring.)

-- 
Philip Taylor
pjt47@cam.ac.uk
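
For anyone wanting to repeat the four-way <img> alt breakdown on their own sample of pages, a minimal sketch follows. It is not the tool used for the figures above; the names classify_alt, AltCounter and count_alts are made up for illustration, and it assumes Python's standard html.parser is an acceptable stand-in for a real browser-grade HTML parser. It also assumes "whitespace" means whatever str.strip() removes, which is broader than HTML's own definition.

```python
from collections import Counter
from html.parser import HTMLParser


def classify_alt(attrs):
    """Classify one <img> from its (name, value) attribute pairs.

    Hypothetical helper: the category labels mirror the four cases
    counted in the message, but the exact rules are an assumption.
    """
    for name, value in attrs:
        if name == "alt":
            # html.parser reports a valueless attribute (<img alt>) as None;
            # treat it the same as alt="" here.
            if value is None or value == "":
                return "zero-length alt"
            if value.strip() == "":
                return "whitespace-only alt"
            return "non-whitespace alt"
    return "no alt"


class AltCounter(HTMLParser):
    """Tallies <img> elements by alt category while parsing."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; self-closing
        # <img ... /> tags are also routed through this method.
        if tag == "img":
            self.counts[classify_alt(attrs)] += 1


def count_alts(html):
    """Return a Counter of alt categories for one HTML document."""
    parser = AltCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


if __name__ == "__main__":
    sample = (
        '<img src="a.png">'
        '<img src="b.png" alt="">'
        '<img src="c.png" alt="   ">'
        '<img src="d.png" alt="Company logo">'
    )
    counts = count_alts(sample)
    total = sum(counts.values())
    for category, n in counts.most_common():
        print(f"{category}: {n} ({100 * n / total:.0f}%)")
```

Summing the Counter objects returned by count_alts over every page in a crawl would give totals comparable in shape to the ones above, though a different parser may disagree on badly broken markup.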
Received on Wednesday, 16 April 2008 23:47:53 UTC