Definition of information (was RE: non-text content)

"Information" is a tricky word. It involves 1.1 as well as 1.3, and of
course is implicated in everything we do.
 
As currently written, Guideline 1.3 says:
<blockquote>
Ensure that information, structure, and functionality are separable from
presentation.
</blockquote>
 
Joe has proposed rewriting the guideline so that it no longer uses the
word "information" but instead says something like the following:
<blockquote>
Whenever markup or languages permit, ensure that
structure, presentation, and behaviour are
separated to the extent possible for the content.

</blockquote>
 
Gregg, wrestling with the proposed definition of non-text content,
writes:
<blockquote>
How about "information" for "content"
 
Some believe structure is not information - others worry that if you
remove structure from, say, a table, its meaning changes.  That would
indicate that you removed information necessary to understanding of the
content.  It may be information about information but it has semantic
content.
 
Otherwise I would go for
 
   * non-text information - information that is not represented by a
     Unicode character or linear presentation of Unicode characters
 
still worry about the word information a bit.

</blockquote>
 
 
In light of all this, I thought it might be worth seeing what WordNet
had to say about information:
<blockquote>
The noun "information" has 5 senses in WordNet.
 
1. information, info -- (a message received and understood)
2. data, information -- (a collection of facts from which conclusions
may be drawn; "statistical data")
3. information -- (knowledge acquired through study or experience or
instruction)
4. information, selective information, entropy -- ((communication
theory) a numerical measure of the uncertainty of an outcome; "the
signal contained thousands
of bits of information")
5. information -- (formal accusation of a crime)
</blockquote>
 
It seems to me that the only sense of the word we *don't* use when we
talk about "information" in the context of WCAG is #5, the one about
lodging a formal accusation of criminal behavior (hmmm, maybe there are
times when we should ...?<grin>)
 
I suspect that we often confuse senses 1-4 when we talk about
"information" in the context of WCAG.  We slide back and forth between
talking about (sense 2) data that appear on the page, "facts from which
conclusions may be drawn"; and (sense 4) the bits, binary digits, that
encode the data (sense 2) for transmission; and from there we slip into
thinking about (sense 3) the knowledge we've acquired through study or
instruction, which is the message (sense 1) that we *want* to
transmit...
 
Everything in the delivery unit (in every delivery unit) is
"information" in sense 4: text (ASCII, Unicode, whatever), graphics,
audio, Flash animations, PDFs, MathML-- whatever it is, if it's on the
Web it's "information" in that sense.  That's part of what Al Gilman
meant when he insisted to me after the f2f that Web content is
technology-- it doesn't exist *as* Web content if it isn't encoded as
information (sense 4).
 
Gregory Bateson said that information is "news of difference that makes
a difference."  Compression algorithms depend on that-- they operate on
the assumption that not all bits are created equal, that some carry more
information than others and that those that carry less information can
be dispensed with, sometimes in ways that allow them to be recovered at
the other end.  The quantity of information in this context is measured
in terms of the uncertainty that a given message or piece of a message
can resolve.  I think this is why you can't remove the markup from a
data table without losing the semantics of the table-- the markup is
information that describes (and, in the context of the Web page,
actually creates) the relationships among the data.
 
So... I think what we're after here is that we want the information
(senses 1, 2, and 3) to be encoded (sense 4) in such a way that it can
be transmitted across a number of different channels (visual, auditory,
etc., etc.) without losing any of the "difference that makes a
difference" (Gregory Bateson's definition of "information").
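
To make sense 4 a little more concrete, here is a minimal Python sketch. It assumes the usual communication-theory measure (Shannon entropy, average bits per symbol), computed from a message's own symbol frequencies. The function name and the sample strings are invented for illustration and come from neither WCAG nor WordNet.

```python
import math
from collections import Counter


def shannon_entropy_bits(message: str) -> float:
    """Average uncertainty per symbol, in bits, using the message's own frequencies."""
    if not message:
        return 0.0
    counts = Counter(message)
    total = len(message)
    # Each symbol with probability p contributes p * log2(1/p) bits.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


varied = "abcdabcdabcdabcd"      # four symbols, equally likely
repetitive = "aaaaaaaaaaaaaaaa"  # one symbol, nothing left to resolve

print(shannon_entropy_bits(varied))      # 2.0
print(shannon_entropy_bits(repetitive))  # 0.0
```

The repetitive string resolves no uncertainty and scores zero, which is the kind of redundancy a compression algorithm can drop and later restore.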
 
I think, too, that what emerged from the call today (thanks to Wendy)
was that what appears in the *perceivable unit* as information (senses
1, 2, and 3) is controlled by but not the same as some of the
"information" that appears in the *delivery unit.*  The delivery unit
contains the markup, the information *about* the information; that
information controls how the information (senses 1, 2, and 3) is
presented in the perceivable unit.
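
A rough Python sketch of the table case, assuming a hypothetical three-column table. The header list stands in for the markup carried in the delivery unit. The three functions and the sample data are made up for illustration and are not part of any WCAG technique.

```python
# Hypothetical delivery-unit content: headers play the role of table markup.
headers = ["City", "High", "Low"]
rows = [["Austin", "31", "19"], ["Madison", "12", "3"]]


def render_visual(headers, rows):
    """Visual channel: column alignment carries the header/value relationships."""
    table = [headers] + rows
    width = max(len(cell) for row in table for cell in row)
    return "\n".join(" | ".join(cell.ljust(width) for cell in row) for row in table)


def render_spoken(headers, rows):
    """Auditory channel: each value is announced together with its header."""
    return [", ".join(f"{h} {v}" for h, v in zip(headers, row)) for row in rows]


def strip_structure(headers, rows):
    """Markup removed: only the bare cell text survives, in source order."""
    return " ".join(headers + [cell for row in rows for cell in row])


print(render_visual(headers, rows))
print(render_spoken(headers, rows))
# ['City Austin, High 31, Low 19', 'City Madison, High 12, Low 3']
print(strip_structure(headers, rows))
# City High Low Austin 31 19 Madison 12 3
```

Both renderings keep each value tied to its header. The stripped string keeps every character of the data, yet nothing in it says which number is a high and which is a low.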
 
I agree with Joe that no one is likely to publish a page that consists
entirely of markup-- <p></p> and <h1></h1> pairs with nothing in between
them.  Such a page would transmit no information because there is
nothing for the markup to differentiate itself from. So the page would
be like those in the old tech manuals: [This page intentionally left
blank]
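
The markup-only page can be checked mechanically. This is a small sketch using Python's standard html.parser module. The class and function names are invented for illustration.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the character data found between tags, ignoring the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def visible_text(markup: str) -> str:
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return " ".join(collector.chunks)


print(repr(visible_text("<h1></h1><p></p><p></p>")))          # ''
print(repr(visible_text("<h1>Title</h1><p>Body text.</p>")))  # 'Title Body text.'
```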
 
The word "content" is similarly slippery. Bringing it in through the
back door (as in Joe's phrase "... to the extent possible for the
content") doesn't really solve anything.
 
On the other hand, the definition of structure that's now in our
glossary *can* be read as including "information" (all 4 senses) and
"content" (in the various slippery ways we use the term), so the first
part of Joe's proposal may work.  Here's how we define structure:
 
<blockquote cite="http://www.w3.org/WAI/GL/WCAG20/#structuredef">
1. The way the parts of an authored unit are organized in relation to
each other and;
2. The way a collection of authored units is organized in relation to a
delivery unit and;
3. The way a collection of delivery units is organized

</blockquote>
 
Our definition of "authored unit," taken from the Device Independence
glossary, is:
 
<blockquote>
Some set of material created as a single entity by an author. Examples
include a collection of markup, a style sheet, and a media resource,
such as an image or audio clip.
</blockquote>
 
As for the phrase "non-text content," I don't think we can substitute
"information" for "content" here-- it would make 1.1 L1 SC2 sound very
odd indeed:
 
<shudder>
For non-text information that conveys information, text alternatives
convey the same information as the non-text information.
</shudder>
 
John

"Good design is accessible design."

Dr. John M. Slatin, Director 
Accessibility Institute
University of Texas at Austin 
FAC 248C 
1 University Station G9600 
Austin, TX 78712 
ph 512-495-4288, fax 512-495-4524 
email jslatin@mail.utexas.edu 
Web http://www.utexas.edu/research/accessibility

	-----Original Message-----
	From: w3c-wai-gl-request@w3.org
[mailto:w3c-wai-gl-request@w3.org] On Behalf Of Gregg Vanderheiden
	Sent: Thursday, April 21, 2005 4:53 PM
	To: w3c-wai-gl@w3.org
	Subject: non-text content
	
	

	    * non-text content - content that is not represented by a Unicode
	      character or sequence of Unicode characters

	I like the direction here.  But we need to handle

	1- content includes structure so the word "content" is problematic here.

	2- ascii (or Unicode) art.

	How about "information" for "content"

	Some believe structure is not information - others worry that if
	you remove structure from, say, a table, its meaning changes.  That
	would indicate that you removed information necessary to understanding
	of the content.  It may be information about information but it has
	semantic content.

	Otherwise I would go for

	   * non-text information - information that is not represented by a
	     Unicode character or linear presentation of Unicode characters

	still worry about the word information a bit.

	
	Gregg
	
	------------------------

	Gregg C Vanderheiden Ph.D. 
	Professor - Depts of Ind. Engr. & BioMed Engr.
	Director - Trace R & D Center 
	University of Wisconsin-Madison 
	<http://trace.wisc.edu/> FAX 608/262-8848
	For a list of our list discussions http://trace.wisc.edu/lists/


Received on Friday, 22 April 2005 15:44:07 UTC