Re: page and site complexity measures [was Re: Web Content Acc

> measurement is the basis of science.

I understand what you mean but you really do have to be careful that 
you're measuring what you really want to measure.  You may prove 
anything by figures.  And I'm not convinced that a simple word-counting 
algorithm can reliably say how easy it is to understand a page.

E.g. the word 'incomprehensibilities' has 21 letters but most English 
people know what it means, whereas the word 'wan' has only three letters 
and I'm surprised how many people don't know it (or at least have to 
think about it).
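
To make the point concrete, here is a minimal sketch of a length-based 
score.  The function name and the "longer means harder" rule are my own 
assumptions for illustration, not taken from any particular tool.

```python
def letter_count_difficulty(word):
    """Hypothetical metric: treat the number of letters as the difficulty."""
    return sum(1 for ch in word if ch.isalpha())

words = ("incomprehensibilities", "wan")
scores = {w: letter_count_difficulty(w) for w in words}

# The metric ranks the long but familiar word as the hard one,
# and the short but less familiar word as the easy one.
hardest = max(scores, key=scores.get)
print(scores, "hardest:", hardest)
```

A score like this only sees the spelling; it has no idea which words 
readers actually know.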

And what about this: "One day Tanya went for a walk", etc etc (story 
mentioning Tanya hundreds of times).  Suppose Tanya is not considered to 
be one of the words in the limited vocabulary, and the author gets told 
off for using it so much?  That kind of thing would be enough to put me 
off using such a tool.
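
A rough sketch of how a naive limited-vocabulary checker might behave on 
that story.  The vocabulary set, the function name and the sample text 
are all made up for illustration; a real tool would presumably use a far 
larger word list and might well handle names differently.

```python
import re

# Hypothetical limited vocabulary (assumption: lowercase common words only).
BASIC_VOCABULARY = {
    "one", "day", "went", "for", "a", "walk",
    "saw", "dog", "the", "and",
}

def out_of_vocabulary_fraction(text, vocabulary):
    """Return the fraction of words in text that are not in vocabulary."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    unknown = sum(1 for w in words if w not in vocabulary)
    return unknown / len(words)

story = ("One day Tanya went for a walk. Tanya saw a dog. "
         "Tanya and the dog went for a walk.")

# Every mention of the name counts against the author.
penalised = out_of_vocabulary_fraction(story, BASIC_VOCABULARY)

# Adding the name to the vocabulary removes the penalty entirely.
with_name = out_of_vocabulary_fraction(story, BASIC_VOCABULARY | {"tanya"})

print(round(penalised, 3), with_name)
```

The story itself is no harder to read in either case; only the word list 
changed.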

I can see that there would be a loose correlation between the 
understandability of a document to a particular group of people and its 
statistics, but this does not mean that a statistics tool can label a 
document "guaranteed readable" or "guaranteed unreadable" with 100% 
accuracy.  There is a danger that authors might make changes to lower 
their difficulty level according to the tool, but in so doing actually 
make the document harder to understand, because the tool they are 
relying on is unreliable.

You can't have science without measurement, but there are so many 
variables here.  You wouldn't be able to do much science if the only 
instrument you had gave you a single reading, being the average of 
temperature, pressure, current, voltage, weight etc.  But if you had 
lots of instruments and didn't understand the science, you'd be 
bewildered.

I think the best test of "is a page easy to understand" is to try it out 
on someone.

Regards

-- Silas S Brown, St John's College Cambridge UK http://epona.ucam.org/~ssb22/

"They get caught by the ideas that they have thought up" - Psalm 10:2

Received on Thursday, 4 March 1999 04:38:54 UTC