Definition of Non-Text Content, Etc. from Eric Hansen on 2000-09-19 (w3c-wai-ua@w3.org from July to September 2000)

From: Eric Hansen <ehansen7@hotmail.com>
Date: Tue, 19 Sep 2000 06:41:51 EDT
To: w3c-wai-ua@w3.org
Cc: ehansen@ets.org
Message-ID: <F248YXifWoqFXfyY0QU000110e5@hotmail.com>
To: UA List (w3c-wai-ua@w3.org)
From: Eric Hansen
Re: Non-Text Content, Text Element, Etc.

Suggestion 1: Provide a definition of "non-text content", etc.

In keeping with the working group's resolution in the last teleconference 
[1] to "Adopt a definition of non-text content, text element, non-text 
element", the following is my wording.

New:

"Non-Text Content, Text Content, Non-Text Element, Text Element"

"The term non-text content in this document refers to _content_ that is 
composed of one or more "non-text elements". Per checkpoint 1.1 of WCAG 1.0, 
the content author must ensure that there is a "text equivalent" for each 
non-text element in author-supplied content. Similarly, the developer of a 
user agent must ensure that a text equivalent is available for any non-text 
element produced by the user agent for the user (see checkpoint 1.5). The 
term "text content" in this document refers to content that is composed of 
one or more "text elements.""

"A "text element" is an element that, when rendered, is understandable in 
_each_ of the three modes: (1) visually-displayed text (for a person who is 
deaf and adept in reading visually-displayed text); (2) synthesized speech 
(for a person who is blind and adept in use of synthesized speech); and (3) 
braille (for a person who is deaf-blind and adept at reading braille). A 
text element may contain markup for structure (e.g., heading levels), and 
style (e.g., font size or color), and so on. However, the essential function 
of the text element should be retained even if style information happens to 
be lost in rendering."

"A non-text element is an element that _fails_ to be understandable when 
rendered in _any_ of the three modes to their respective disability audiences."

"Note that the terms text element and non-text element are defined by the 
characteristics of their output (rendering) rather than those of their 
input. For example, in principle, a text equivalent can be generated or 
encoded in any fashion as long as it has the proper output characteristics. 
[EH Question: Do we need to say: "However, in many cases, text elements are 
encoded as text -- with or without markup."? Or do we need, for practical or 
other reasons, to otherwise constrain the format of the text equivalent?] A 
text equivalent may be understood as "pre-rendering" content in contrast to 
the "post-rendering" content that it produces (visually-displayed text, 
synthesized speech, braille)."

A _text equivalent_ is a text element that, when rendered, serves 
essentially the same function as some other content (e.g., "primary" 
content) does for a person without any disability (see definition of 
_alternative equivalents_).
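
The relationship between the proposed definitions of "text element" and 
"non-text element" above can be sketched in code. This is only a minimal 
illustration, assuming an element can be described by the set of modes in 
which its rendering is understandable; the names Mode, is_text_element, and 
is_non_text_element are hypothetical and do not appear in the draft.

```python
from enum import Enum


class Mode(Enum):
    """The three rendering modes named in the proposed definition."""
    VISUAL_TEXT = "visually-displayed text"
    SYNTHESIZED_SPEECH = "synthesized speech"
    BRAILLE = "braille"


def is_text_element(understandable_in):
    """True only if the rendering is understandable in each of the three modes."""
    return set(Mode) <= set(understandable_in)


def is_non_text_element(understandable_in):
    """True if the rendering fails to be understandable in any one mode."""
    return not is_text_element(understandable_in)


# Hypothetical examples: a paragraph of prose versus a piece of ASCII art.
paragraph = {Mode.VISUAL_TEXT, Mode.SYNTHESIZED_SPEECH, Mode.BRAILLE}
ascii_art = {Mode.VISUAL_TEXT}
```

Under this reading the two categories are mutually exclusive and together 
cover every element: failing in a single mode is enough to make an element a 
non-text element.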

Comment 1:

I think that it is imprecise and misleading to define text equivalents as 
"unrendered content" and the content that they produce (visually-displayed 
text, synthesized speech, and braille) as the "rendered content". While 
those terms are often correct in particular instances, they don't rise to 
the level of defining characteristics. I think that the terms "pre-rendering 
content" and "post-rendering content", while perhaps a bit unfamiliar, are 
correct.

Comment 2:

I think that the definition of text element could very appropriately include 
the requirements of WCAG 1.0 checkpoint 14.1 ("Use the clearest and simplest 
language appropriate for a site's content. [Priority 1]"). For example, if a 
text equivalent fails to adhere to checkpoint 14.1, the whole document fails 
to conform to WCAG 1.0. However, I am not sure that it is essential in 
this context.

Comment 3:

Another possibly essential part of the definition could be as follows: "The 
'visually-displayed text' must be composed of one or more characters from 
any of the standard [?] character sets. That is, if one disregards stylistic 
features of the text, the visually displayed text will look like standard 
graphical representations of the characters (or glyphs) in the character 
set." One could also affirm similar things for braille output, though I 
don't think that we have to do this for this document. See Suggestion 7 in 
this memo regarding the definition of "text".

Comment 4:

I don't think that the term "text content" now appears in the document other 
than in this definition. However, it is found in WCAG 1.0 in a way that 
seems consistent with the proposed definition ("Text content can be 
presented to the user as synthesized speech, braille, and visually-displayed 
text. Each of these three mechanisms uses a different sense -- ears for 
synthesized speech, tactile for braille, and eyes for visually-displayed 
text -- making the information accessible to groups representing a variety 
of sensory and other disabilities.").

Comment 5:

Under the proposed definitions, user agents must expect that _markup_ may be 
found in text equivalents. However, I don't think that we normally think of 
values of the "alt" attribute as containing markup. Should we specify that 
user agents need to be ready to recognize the contents as plain text, as 
marked-up text, or as a URI of a file to open or execute? How can they 
be expected to recognize them as such?
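
To make the question in Comment 5 concrete, here is a rough sketch of how a 
user agent might guess which of the three forms a text equivalent takes. It 
is a heuristic under stated assumptions, not a proposal for the guidelines: 
the function name classify_text_equivalent, the tag pattern, and the 
example.org address are all hypothetical.

```python
import re
from urllib.parse import urlparse

# Heuristic only: matches something shaped like an opening or closing tag.
_TAG_PATTERN = re.compile(r"</?[A-Za-z][^<>]*>")


def classify_text_equivalent(value):
    """Guess whether a text equivalent is a URI, marked-up text, or plain text."""
    stripped = value.strip()
    try:
        parsed = urlparse(stripped)
    except ValueError:
        # Malformed input cannot be a usable URI; fall through to the other checks.
        parsed = None
    # Treat the value as a URI only if it has a scheme, a location, and no spaces.
    if (
        parsed is not None
        and parsed.scheme
        and (parsed.netloc or parsed.path)
        and " " not in stripped
    ):
        return "uri"
    if _TAG_PATTERN.search(stripped):
        return "markup"
    return "plain"


kinds = [
    classify_text_equivalent("Company logo"),
    classify_text_equivalent("A <em>red</em> stop sign"),
    classify_text_equivalent("http://example.org/description.html"),
]
```

A guess of this kind is easily fooled: a short value that contains a colon 
and no spaces can be mistaken for a URI, which suggests that recognition may 
need to rest on something more explicit than inspecting the value.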

====

Suggestion 2: Fix checkpoint 7.5.

Change the term "rendered text content" simply to "text rendered from the 
DOM". The term "text content" has a special meaning (i.e., content that is 
composed of one or more text elements).

Old (1 September 2000):

"7.5 Allow the user to search for rendered text content, including rendered 
text equivalents. Allow the user to start a forward search from a location 
in content selected or focused by the user. After a match, allow searching 
from location of the match. Provide a case-insensitive search option when 
applicable to the natural language of text. [Priority 2]"

New:

"7.5 Allow the user to search for text rendered from the DOM, including 
rendered text equivalents. Allow the user to start a forward search from a 
location in content selected or focused by the user. After a match, allow 
searching from location of the match. Provide a case-insensitive search 
option when applicable to the natural language of text. [Priority 2]"
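
The search behavior that checkpoint 7.5 describes could be sketched as 
follows. This is a minimal illustration, assuming the rendered text is 
available as a single string; find_forward and its parameters are 
hypothetical names, not part of any user agent API.

```python
import re


def find_forward(rendered_text, query, start=0, ignore_case=False):
    """Return the index of the first match at or after `start`, or -1 if none."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape makes the query match literally rather than as a pattern.
    match = re.compile(re.escape(query), flags).search(rendered_text, start)
    return match.start() if match else -1


# Hypothetical rendered text, used only for illustration.
rendered = "Text equivalents: a text equivalent for each image."

# Start a forward search from the beginning, then continue after the match.
first = find_forward(rendered, "text", ignore_case=True)
second = find_forward(rendered, "text", start=first + 1, ignore_case=True)
```

Passing the position just past the previous match as `start` corresponds to 
"after a match, allow searching from location of the match". The 
case-insensitive option here relies on the regular expression engine's 
notion of case, which is only an approximation of natural-language rules.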

====

Suggestion 3: Fix the definition of "Configure and Control".

Change the term "all text content" simply to "all text from the DOM". The 
term "text content" has a special meaning (i.e., content that is composed of 
one or more text elements).

Old (1 September 2000):

"For example, users may configure the user agent to apply the same font 
family across Web resources, so that all text content is displayed by 
default using that font family. Or, the user may wish to configure the 
rendering of a particular element type, which may be done through style 
sheets."

New:

"For example, users may configure the user agent to apply the same font 
family across Web resources, so that all text from the DOM is displayed by 
default using that font family. Or, the user may wish to configure the 
rendering of a particular element type, which may be done through style 
sheets."

====

Suggestion 4: Fix the definition of "Content".

Old (1 September 2000):

"Content"
"In this specification, the term "content" is used in two ways:"
"1. Content refers to the document object as a whole or in parts. Phrases 
such as "content type", "text content", and "language of content" refer to 
this usage. When used in this sense, the term content encompasses equivalent 
alternatives. Refer also to the definition of rendered content. and other 
accessibility information."
"2. Content refers to the content of an HTML or XML element, in the sense 
employed by the XML 1.0 specification ([XML], section 3.1): "The text 
between the start-tag and end-tag is called the element's content." Context 
should indicate that the term content is being used in this sense."

New:

"Content"
"In this specification, the term "content" is used in three ways:"
"1. Content refers to the DOM document object as a whole or in parts. 
Phrases such as "content type" and "language of content" refer to this 
usage. When used in this sense, the term content encompasses equivalent 
alternatives. Refer also to the definition of rendered content <NOTE PERIOD 
DELETED> and other accessibility information."
"2. Content refers to the content of an HTML or XML element, in the sense 
employed by the XML 1.0 specification ([XML], section 3.1): "The text 
between the start-tag and end-tag is called the element's content." Context 
should indicate that the term content is being used in this sense."
"3. Content is used in the context of the phrases "non-text content" and 
"text content". See definition of _non-text content_."

Comment 1:

As far as I know, in the current document, all instances of meaning #3 are 
also instances of meaning #1. However, I think that the definitions must be 
separate.

Comment 2:

I think we need to make clear that when we refer to "content", we are 
referring to the DOM (i.e., DOM2), not to any other kind of "document 
object". (No Subject can conform unless it exports the DOM2 DOM.)

Comment 3:

I wonder if the term "content type" really should be "media type". Or does 
DOM2 specifically refer to "content types"?

====

Suggestion 5: Fix the definition of Document Object.

Old (1 September 2000):

Document Object, Document Object Model
The document object is the user agent's representation of data (e.g., a 
document). This data generally comes from the document source, but may also 
be generated (from style sheets, scripts, transformations, etc.) or produced 
as a result of preferences set within the user agent. Some data that is part 
of the document object is routinely rendered (e.g., in HTML, what appears 
between the start and end tags of elements and the values of attributes such 
as "alt", "title", and "summary"). Other parts of the document object are 
generally processed invisibly by the user agent, such as DTD-defined names 
of element types and attributes, and other attribute values such as "href", 
"id", etc. These guidelines require that users have access to both types of 
data through the user interface.
A document object model is the abstraction that governs the construction of 
the user agent's document object. The document object model employed by 
different user agents will vary in implementation and sometimes in scope. 
Nevertheless, this document calls for developers of user agents to adhere to 
the W3C Document Object Model (DOM), which specifies a standard interface 
for accessing HTML and XML content. This standard interface allows authors 
to access and modify the document with a scripting language (e.g., 
JavaScript) in a consistent manner across different scripting languages. As 
a standard interface, use of a W3C DOM makes it easier not just for authors 
but for assistive technology developers to extract information and render it 
in ways most suited to the needs of particular users. The relevant W3C DOM 
Recommendations are listed in the references. In this specification, the 
acronym "DOM" refers to the W3C DOM.

New:

Document Object, Document Object Model

In general usage, the term "document object" refers to the user agent's 
representation of data (e.g., a document). This data generally comes from 
the document source, but may also be generated (from style sheets, scripts, 
transformations, etc.) or produced as a result of preferences set within the 
user agent. Some data that is part of the document object is routinely 
rendered (e.g., in HTML, what appears between the start and end tags of 
elements and the values of attributes such as "alt", "title", and 
"summary"). Other parts of the document object are generally processed 
invisibly by the user agent, such as DTD-defined names of element types and 
attributes, and other attribute values such as "href", "id", etc. These 
guidelines require that users have access to both types of data through the 
user interface.
A "document object model" is the abstraction that governs the construction 
of the user agent's document object. The document object model employed by 
different user agents will vary in implementation and sometimes in scope.
In the context of requirements of this document, the terms "document object" 
and "document object model" (DOM) refer specifically to the W3C DOM2, which 
specifies a standard interface for accessing HTML and XML content. This 
standard interface allows authors to access and modify the document with a 
scripting language (e.g., JavaScript) in a consistent manner across 
different scripting languages. As a standard interface, use of a W3C DOM 
makes it easier not just for authors but for assistive technology developers 
to extract information and render it in ways most suited to the needs of 
particular users. The relevant W3C DOM Recommendations are listed in the 
references.
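
The distinction drawn above between data that is routinely rendered and 
data that is generally processed invisibly can be illustrated with a short 
script. This is a sketch only: Python's xml.dom.minidom module, a minimal 
DOM implementation, stands in for a user agent's DOM, and the document 
source shown is hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical document source, used only for illustration.
SOURCE = (
    "<html><body>"
    '<h1 id="top">Results</h1>'
    '<img src="chart.png" alt="Bar chart of quarterly sales" title="Sales"/>'
    '<a href="data.html">Raw data</a>'
    "</body></html>"
)

document = parseString(SOURCE)

# Attribute values that are routinely rendered, such as "alt".
alt_texts = [
    img.getAttribute("alt") for img in document.getElementsByTagName("img")
]

# Attribute values that are generally processed invisibly, such as "href".
link_targets = [
    a.getAttribute("href") for a in document.getElementsByTagName("a")
]

# Element content, i.e. what appears between the start and end tags.
heading_text = document.getElementsByTagName("h1")[0].firstChild.data
```

Because both kinds of data are reached through the same standard calls, an 
assistive technology could extract either one without knowing how a 
particular user agent stores its document object internally.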

====

Suggestion 6: Fix checkpoint 6.1.

Make clear which conformance levels of WCAG 1.0 and ATAG 1.0 are intended. 
Is it a Priority 1 UAAG requirement to ensure that features that support 
_all three_ WCAG and ATAG priority levels are implemented? I am not sure 
whether requiring all three levels is too stringent. (I can see that this 
is different from the case of documentation, in which we decided that 
double-A WCAG conformance was adequate.) But in any case, I think that 
there needs to be a note indicating which priority levels are intended.

Old (1 September 2000):

"6.1 Implement the accessibility features of all supported specifications 
(markup languages, style sheet languages, metadata languages, graphics 
formats, etc.). Accessibility features are those identified in the 
specification and those features of the specification that support 
requirements of the "Web Content Accessibility Guidelines 1.0" [WCAG10], the 
"Authoring Tool Accessibility Guidelines 1.0" [ATAG10], and the current 
document. [Priority 1]"
"Note: This checkpoint applies to all specifications, not just W3C 
specifications. The Techniques document [UAAG10-TECHS] provides information 
about the accessibility features of some specifications, including W3C 
specifications."
"Techniques for checkpoint 6.1"

====

Suggestion 7: Add a definition of "text".

Per our earlier discussion [1], there should be a definition of "text".

I think that we need to be very thoughtful and deliberate about adding such 
a definition. If we specify some set of character sets, then what do we mean 
-- (a) that those sets must _all_ be supported or (b) that user agents are 
_not required_ to support anything outside of them or (c) that they _cannot_ 
support anything outside them?

If we name a specific set of character sets as what we mean, then we need to 
carefully examine the contexts of its use. I am not sure how fully various 
character sets are supported in braille and synthesized speech. Perhaps we 
need not have the definition specifically refer to braille and synthesized 
speech renderings.

Without necessarily naming a particular set of character sets, it might be 
valuable to simply make clear that when we refer to text we are referring to 
the stuff of _character sets_ rather than to things like sign language 
videos. (The term 'text' is sometimes used outside the document to encompass 
such divergent things.)

====

Suggestion 8: Confirm the adequacy of the word "content" as used in 
checkpoints 2.5, 2.6, and 3.8.

I believe that if the foregoing changes are made, then the current wording 
of checkpoints 2.5, 2.6, and 3.8 appears adequate.

[1] http://lists.w3.org/Archives/Public/w3c-wai-ua/2000JulSep/0387.html

======================

Received on Tuesday, 19 September 2000 06:42:23 UTC