Re: on measures of reading level from Charles McCathieNevile on 2001-05-04 (w3c-wai-gl@w3.org from April to June 2001)

From: Charles McCathieNevile <charles@w3.org>
Date: Fri, 4 May 2001 03:08:26 -0400 (EDT)
To: Matt May <mcmay@bestkungfu.com>
cc: <w3c-wai-gl@w3.org>
Message-ID: <Pine.LNX.4.30.0105040253540.7561-100000@tux.w3.org>

I agree that Flesch-Kincaid and Gunning Fog are at best very rough guesses.
As you say, they reward short sentences made of short words, so they are
only a rough measure. I also agree that some grammar checking would help
give a better measure. Spell checking is not quite as high a priority: if
the spellchecker can recognise the word, most people will too, whatever the
spelling. But checking whether words are in a common dictionary (more or
less the same process, but using a different dictionary) would be a useful
test.
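
A rough sketch of what such a common-dictionary test could look like, in
Python. The word list here is a tiny made-up placeholder and the function
name is only illustrative; a real tool would load an agreed list of common
words.

```python
import re

# Hypothetical, deliberately tiny word list for illustration only.
# A real checker would load a much larger list of common words.
COMMON_WORDS = {"the", "cat", "sat", "on", "a", "mat", "is", "it"}


def uncommon_words(text, dictionary=COMMON_WORDS):
    """Return the sorted, de-duplicated words in text not found in dictionary."""
    # Lower-case the text and pull out runs of letters, allowing one
    # internal apostrophe so that "don't" stays a single word.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return sorted({w for w in words if w not in dictionary})


if __name__ == "__main__":
    print(uncommon_words("The cat sat on a brad."))
```

This only flags words for a human to look at. It says nothing about whether
a flagged word is actually hard in context.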

If we could specify three or four simple tests at the same time, and provide
tools to do these tests, we would be well on the way to helping people get a
rough idea. Another test that Jonathan Chetwynd has often suggested is word
count. If there are more than 30 words on the page, he says, people that he
works with find it very hard.

This would also be useful.
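
As a sketch only, the word count test is about as simple as a test gets.
The limit of 30 is the figure Jonathan suggests; the function names and the
choice to split on whitespace are my assumptions.

```python
def word_count(text):
    """Count whitespace-separated words in text."""
    return len(text.split())


def exceeds_word_limit(text, limit=30):
    """Return True if text has more than limit words.

    The default of 30 is the threshold mentioned in this thread; it is
    not taken from any guideline.
    """
    return word_count(text) > limit


if __name__ == "__main__":
    page = "Click the red button to start the game."
    print(word_count(page), exceeds_word_limit(page))
```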

(Note that all this is in theory required by WCAG 1.0...)

cheers

Charles McCN

On Thu, 3 May 2001, Matt May wrote:

  I've done some research on reading level algorithms, and I think what I've
  found is enough to give us pause with respect to putting any real faith in
  the numbers that are produced, or basing any guidelines on them.

  The first one I researched was the Flesch-Kincaid grade level. The algorithm
  is as follows:
  (.39 * w) + (11.8 * s) - 15.59
  where
  w is the average number of words per sentence, and
  s is the average number of syllables per word.
  Negative FK results are reported as zero. Numbers over 12 are reported as
  12.
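
  As a minimal sketch of the formula above in Python, assuming the two
  averages have already been computed elsewhere (counting syllables
  reliably is its own problem). The function names are illustrative only,
  and the clamping follows the reporting behaviour described above.

```python
def flesch_kincaid_raw(words_per_sentence, syllables_per_word):
    """Raw Flesch-Kincaid grade: (.39 * w) + (11.8 * s) - 15.59."""
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


def flesch_kincaid_reported(words_per_sentence, syllables_per_word):
    """Grade as reported: negative results become 0, results over 12 become 12."""
    raw = flesch_kincaid_raw(words_per_sentence, syllables_per_word)
    return min(12.0, max(0.0, raw))


if __name__ == "__main__":
    # Very short sentences of one-syllable words go negative and report as 0.
    print(flesch_kincaid_reported(3, 1.0))
    # Long sentences with longer words hit the cap of 12.
    print(flesch_kincaid_reported(30, 1.5))
```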

  FK is based on the Flesch Reading Ease score, which is itself just a measure
  of syllable count and sentence length. The modified algorithm was published
  in 1975 by a researcher trying to create readable documents for enlisted
  personnel in the US Navy, and doesn't appear to have any real attachment to
  education, much less cognitive disability. The subjects in Kincaid's
  research were adults, presumably skewed 18-30 and male (and by definition
  skewed American), and the resulting algorithm really wasn't intended to be
  utilized as widely as it is.

  The other major grade-level index is the Gunning Fog index. Its algorithm
  is:
  (w + h) * .4
  where
  w is the average number of words per sentence, and
  h is the percentage of words with three or more syllables ("hard words")

  Gunning Fog is capped at 17, where "17-plus" is suggested as
  postgraduate-level writing.
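
  The same kind of sketch for Gunning Fog, again in Python and again
  assuming the inputs are computed elsewhere. Note that h is a percentage
  (0 to 100), not a fraction. The function name is illustrative only.

```python
def gunning_fog(words_per_sentence, percent_hard_words):
    """Gunning Fog index: (w + h) * .4, capped at 17.

    percent_hard_words is the percentage (0-100) of words with three or
    more syllables.
    """
    raw = 0.4 * (words_per_sentence + percent_hard_words)
    return min(17.0, raw)


if __name__ == "__main__":
    # 10 words per sentence, 5% hard words.
    print(gunning_fog(10, 5))
    # Long sentences with many hard words are capped at 17.
    print(gunning_fog(40, 20))
```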

  Both of these algorithms reward short sentences built from short words,
  irrespective of how many such sentences are needed to communicate the
  point. These
  indices are also not meant to rate an entire document, but rather
  extrapolate scores from small passages (100 words).

  I picked a couple of random examples to illustrate results. The first
  sentence of Lincoln's Gettysburg address rates a 12 (the raw number is
  closer to 18) on FK, while Gwendolyn Brooks' poem "We Real Cool"[1] rates a
  0. I was a third-grader when I studied Lincoln, not a post-doctorate, and I
  think I still got the gist. :)

  Now, here's the part that bugs me. First off, by interjecting sentences like
  the previous one (short, no polysyllabics), I can lower the overall score
  of this message. And if the goal of a site is to work its way down to a
  prescribed reading level, then that is likely to be what they try to do:
  they'll boil down the content by working it until it comes up with a low
  enough score, and say that's that. No real usability or accessibility gain
  can be found by fostering this type of practice, where people are writing to
  the index, rather than to the reader.

  Secondly, syllable count is a really weak measure of complexity. Is
  "Germany" a more difficult message to communicate than "France"? Do
  7-year-olds know what a brad is? The idea is based on an assumption that
  longer words are harder than short ones. There's a correlation there, but
  it's not reliable. Just as relevant to comprehension are educational
  environment, cultural influence, and above all, context.

  My last problem is that none of these actually check spelling or grammar. It
  seems those might be somewhat relevant, as well.

  The information that I read seemed to suggest (where it didn't say outright)
  that reading-level indices have been used largely as pseudoscience: an
  overly simplistic "scientific" numeric answer to the readability of a
  system.

  I found a document[2] which lays out ten principles of clear statement. I
  think this is a lot closer to my ideal of providing content providers with
  solid guidance for good writing, and in fact, we may want to consider asking
  to incorporate these principles. This document comes from the creators of
  the Gunning Fog index, circa 1973, and at the end, it emphasizes that
  systems like the fog index, while of some utility, are not a panacea:

  "It is important not to over-use the fog index. Use it only occasionally to
  spot-check your writing. Don't write to make a good fog index score. That
  will make you write short, choppy sentences. Like these.
  Instead, learn and practice using the 'Ten Principles of Clear Statement.'
  If you observe these guides to good writing, your writing will naturally
  grow easier to understand."

  [1] http://www.poets.org/poems/poems.cfm?prmID=1233
  [2] http://muextension.missouri.edu/xplor/comm/cm0201.htm
  -
  m


-- 
Charles McCathieNevile    http://www.w3.org/People/Charles  phone: +61 409 134 136
W3C Web Accessibility Initiative     http://www.w3.org/WAI    fax: +1 617 258 5999
Location: 21 Mitchell street FOOTSCRAY Vic 3011, Australia
(or W3C INRIA, Route des Lucioles, BP 93, 06902 Sophia Antipolis Cedex, France)
Received on Friday, 4 May 2001 03:08:40 UTC