Re: Recently discovered issue with WCAG2ICT definition of "document" - suggesting a new note to clarify from Peter Korn on 2013-07-04 (public-wcag2ict-tf@w3.org from July 2013)

From: Peter Korn <peter.korn@oracle.com>
Date: Wed, 03 Jul 2013 22:56:52 -0700
To: Gregg Vanderheiden <gv@trace.wisc.edu>
CC: David MacDonald <david100@sympatico.ca>, public-wcag2ict-tf@w3.org, Gregg Vanderheiden <ez1testing@gmail.com>, kirsten@can-adapt.com
Message-ID: <51D50EA4.8090304@oracle.com>

Gregg, David,

I think where we are getting tripped up is around the common-sense 
concept of what a document is, vs. files that could contain information 
that in some fashion gets displayed to a user, at some point, by software.

I think about files used internally by some software to persist the user 
interface (see the last example paragraph in SC 4.1.1 
<http://www.w3.org/WAI/GL/wcag2ict/#ensure-compat-parses>: "Examples of 
markup used internally for persistence of the software user interface 
that are never exposed to assistive technology include: XUL, GladeXML, 
and FXML. In these examples assistive technology only interacts with the 
user interface of generated software.").  These files define a software 
program's user interface - the contents of the menus and toolbars and 
dialog boxes.  But for the fact that they happen to exist as a separate 
file on disk, they are simply part of the software program as shipped, 
and we don't treat them as documents.  If instead of being encoded in 
ASCII/UNICODE, they were in binary form, nobody would be the wiser that 
these files weren't executable programs. *We don't think of these XML UI 
definition files as "documents" for the purposes of WCAG2ICT.*  We don't 
attempt to apply all of the success criteria to them separately; *they 
are simply a part of the software program and they are covered through 
the evaluation of the software program*.  If there is something missing in 
them needed for accessibility (e.g. ALT text for the icon in the 
toolbar), that causes the software to fail a success criterion, then the 
software simply fails the SC.

Similarly, a virus definition file that had embedded within it the names 
of known viruses and the names of places they appear - which may get 
displayed to the user when a virus is found - is really part of the 
anti-virus application (as periodically updated by the vendor).  If they 
were binary files that were delivered as "software patches" we wouldn't 
think of them as documents.  That they happen to have filenames encoded 
in ASCII/UNICODE should make no difference.  As with the 
XUL/GladeXML/FXML example in the paragraph above, they are simply a part 
of the software program and they are covered through the evaluation of 
the software program.  If there is something missing in them, that 
causes the software to fail a success criterion, then the software 
simply fails the SC.  It doesn't matter from which software file the 
failure arises.
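
A minimal sketch of this point, under stated assumptions: the JSON layout, 
the field names (name, icon, icon_alt) and the helper functions below are 
all hypothetical and are not taken from WCAG2ICT or from any real 
anti-virus product.  The check runs against what the software presents, 
not against the definition file as if it were a document, and a missing 
text alternative is reported as a failure of the software.

```python
import json
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical virus definition data shipped as part of the software.
# The second entry has an icon but no text alternative for it.
DEFINITIONS_JSON = """
[
  {"name": "ExampleWorm", "icon": "worm.png", "icon_alt": "Worm icon"},
  {"name": "SampleTrojan", "icon": "trojan.png"}
]
"""


@dataclass
class UiItem:
    """One item as the software would present it to the user."""
    label: str
    icon: Optional[str]
    icon_alt: Optional[str]


def render_alerts(definitions_text: str) -> List[UiItem]:
    """Build the user-facing items from the definition data."""
    entries = json.loads(definitions_text)
    return [
        UiItem(e["name"], e.get("icon"), e.get("icon_alt"))
        for e in entries
    ]


def missing_text_alternatives(ui: List[UiItem]) -> List[str]:
    """Return labels of presented items whose icon has no text alternative.

    The evaluation looks only at the rendered user interface; it does not
    matter which file the missing information should have come from.
    """
    return [item.label for item in ui if item.icon and not item.icon_alt]


ui = render_alerts(DEFINITIONS_JSON)
failures = missing_text_alternatives(ui)
print(failures)
```

In this sketch a non-empty result means the software fails the check, 
whichever file supplied the incomplete data.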

Finally, if someone were to write a program (and defined the 
accompanying database) that stored & retrieved documents, the fact that 
the storage mechanism is in a database file (or collection of files) is 
no different than if instead the "file" was a filesystem on a disk 
drive.  If you have ever run virtualization software like VirtualBox, 
you may notice that the "hard drive" that gets created for your virtual 
machine is in fact a file in the filesystem of the underlying platform.  
That "hard drive" file will contain any number of documents (and 
programs and so forth).  That doesn't make the hard drive file */itself/* 
a document (any more than a database into which someone has stored 
documents thereby becomes itself a document).  We don't apply WCAG2ICT's 
success criteria to the VirtualBox hard drive file in the underlying 
platform.


So... assuming we all agree with those three paragraphs above, the 
question becomes how best to state this.

Gregg - the approach you are advocating puts a constraint on the types 
of files: they avoid being called "documents" only if they "do not 
present information to users through a user agent" (this is because of 
where you have placed the comma).  But since we have redefined content 
from what it was in WCAG - to remove the term "user agent" from it - we 
have content being any "information and sensory experience to be 
communicated to the user by means of software".  So we have something 
that is circular.


Maybe I can get at this another way: by making clear that where files 
that are simply part of software happen to contain "information and 
sensory experience to be communicated to the user", you don't consider 
those separate files to be documents, but instead apply WCAG2ICT to that 
software (and the content rendered by it, wherever it may have come 
from).  See the new 2nd sentence below:

    *Note 3: Software configuration and storage files such as databases
    and virus definitions, as well as computer instruction files such as
    source code, batch/script files, and firmware, are not examples of
    documents. If and where software retrieves "information and sensory
    experience to be communicated to the user" from such files, those
    files contribute to content
    <http://www.w3.org/WAI/GL/wcag2ict/#keyterms_content> that occurs in
    software (and WCAG2ICT applies to that software
    <http://www.w3.org/WAI/GL/wcag2ict/#keyterms_software>).
    *


David - in the case of your example of a database containing 
documents... if the document is never available separately (e.g. the 
software program that stores/retrieves/displays the document from the 
database is the only way a user can ever read & interact with the 
document), then I claim it isn't a document.  If this were a closed 
system (e.g. a kiosk) displaying canned information stored entirely 
inside it (not retrieved over the web), we would only evaluate it as 
software (with closed functionality).  We wouldn't attempt to say that 
the kiosk's information was contained in one or more documents that can be 
separately evaluated - that information is opaque to us.

Now, if/when a document is retrieved from a database and emitted into a 
stand-alone form that can separately be retrieved and presented by a 
user agent (e.g. I've obtained a Word file from Microsoft Sharepoint and 
stored a snapshot of it on my local hard drive), then that */becomes/* a 
document and it can be separately evaluated as such.  But the datastore 
maintained by Microsoft Sharepoint (containing any number of documents 
and document revisions, in any number of snapshots and states), isn't 
itself a document.  It is a file that is internal to the application.


Peter


On 7/3/2013 6:40 PM, Gregg Vanderheiden wrote:
>
> On Jul 3, 2013, at 8:22 PM, Peter Korn <peter.korn@oracle.com> wrote:
>
>> Gregg,
>>
>> Your suggestion leads to circular reasoning.
>>
>> The problem with this route is then any time we have some information 
>> in some file somewhere, and that information is the source in some 
>> fashion of "content", the software that presents it becomes a user 
>> agent?  And the file becomes a document?
>
> If the information is displayed to users -- it IS content.  And if the 
> database contains the text and images to display -- then it HAS to 
> contain the alternate text for the images. (The app displaying the 
> data can't add alt text itself - it doesn't know what the data is until 
> display time.)
>
> So this is exactly what we WANT it to say.
>
>>
>> So if my virus definition file contains the names of viruses, and 
>> those names are displayed in my anti-virus program, the anti-virus 
>> program is now a user agent?  And the virus definition file is now a 
>> document?
>
> Absolutely.   And if the virus definition files used icons instead of 
> text to 'name' the viruses - the virus definition file would have to 
> have alt text for those icons.
>
> And if there is any other non-text information to be displayed to the 
> user-- the virus definition file would need to have the text 
> alternative so the application could provide that text as well in a 
> programmatically determinable way.
>
>>
>>
>> That makes no sense.
>
> Make sense now?
>
>
> G
>
>>
>>
>> Peter
>>
>> On 7/3/2013 6:18 PM, Gregg Vanderheiden wrote:
>>> how about instead of raw - we pick up on the key distinction.
>>>
>>>     *(New) Note 3: Software configuration and storage files such as
>>>     databases and virus definitions, as well as computer instruction
>>>     files such as source code, batch/script files, and firmware,
>>>     that do not present information to users through a user
>>>     agent are not examples of documents.  Such files are not
>>>     "information and sensory experience to be communicated to the
>>>     user" and therefore are not considered content*
>>>
>>>
>>> If a database IS just data that a user agent displays- then it WOULD 
>>> be covered.  One could argue that an html file is sourcecode for the 
>>> page rendering.  Certainly the javascript is.
>>>
>>>
>>> /Gregg/
>>> --------------------------------------------------------
>>> Gregg Vanderheiden Ph.D.
>>> Director Trace R&D Center
>>> Professor Industrial & Systems Engineering
>>> and Biomedical Engineering University of Wisconsin-Madison
>>> Technical Director - Cloud4all Project - http://Cloud4all.info 
>>> <http://cloud4all.info/>
>>> Co-Director, Raising the Floor - International - 
>>> http://Raisingthefloor.org <http://raisingthefloor.org/>
>>> and the Global Public Inclusive Infrastructure Project - 
>>> http://GPII.net <http://gpii.net/>
>>>
>>> On Jul 3, 2013, at 7:50 PM, Peter Korn <peter.korn@oracle.com> wrote:
>>>
>>>> David,
>>>>
>>>> What makes a file "raw"?  I view the situation of a program 
>>>> retrieving data from somewhere and presenting it within its user 
>>>> interface as "content" that is displayed in software.  Said content 
>>>> must be accessible.  Said content could come from a database file.  
>>>> Said content could be a persisted user interface (cf. SC 4.1.1 
>>>> <http://www.w3.org/WAI/GL/wcag2ict/#ensure-compat-parses>).  And 
>>>> just like the 4.1.1 case (addressing your PS in the following 
>>>> e-mail), there could be information in that file that helps with 
>>>> accessibility (e.g. the database contains images and also ALT text 
>>>> for those images).
>>>>
>>>> But we aren't losing anything here - whatever is in the database 
>>>> that winds up being presented in a user interface is content that 
>>>> must be accessible.  If it isn't accessible when presented in 
>>>> software, WCAG2ICT catches it.
>>>>
>>>> But it doesn't make sense to try to apply all of WCAG to a database 
>>>> file as if it was a web page or a word processing file.  That's the 
>>>> point here.
>>>>
>>>>
>>>> Peter
>>>>
>>>> On 7/3/2013 5:43 PM, David MacDonald wrote:
>>>>>
>>>>> Just one nit...
>>>>>
>>>>>
>>>>> Can we add the word “raw” or some other word to make it clearer...
>>>>>
>>>>> *... raw storage files such as databases*
>>>>>
>>>>>
>>>>> I’m a little nervous it might make the pendulum swing the other 
>>>>> way and some administrators might think it’s not a document if a 
>>>>> user agent is serving up content from a database on the backend...
>>>>>
>>>>>
>>>>> Cheers
>>>>>
>>>>> David MacDonald
>>>>>
>>>>> *CanAdapt Solutions Inc.*
>>>>>
>>>>> /Adapting the web to *all* users/
>>>>>
>>>>> /Including those with disabilities/
>>>>>
>>>>> www.Can-Adapt.com <http://www.can-adapt.com/>
>>>>>
>>>>>
>>>>> *From:*Peter Korn [mailto:peter.korn@oracle.com]
>>>>> *Sent:* July-03-13 6:59 PM
>>>>> *To:* public-wcag2ict-tf@w3.org Force
>>>>> *Subject:* Recently discovered issue with WCAG2ICT definition of 
>>>>> "document" - suggesting a new note to clarify
>>>>>
>>>>> Hi gang,
>>>>>
>>>>> As part of a wider review of WCAG2ICT (asking colleagues who 
>>>>> aren't on the Task Force to look at it), I just discovered an 
>>>>> issue with the definition of "document 
>>>>> <http://www.w3.org/WAI/GL/wcag2ict/#keyterms_document>". The issue 
>>>>> is that readers will see the term "document" and think "file", and 
>>>>> therefore try to apply WCAG requirements to all manner of files 
>>>>> (virus definition files and programming files were two specific 
>>>>> concerns that came up from colleagues).
>>>>>
>>>>> While our definition of "document" is based on the term "content 
>>>>> <http://www.w3.org/WAI/GL/wcag2ict/#keyterms_content>" (which is 
>>>>> scoped to "information and sensory experience to be communicated 
>>>>> to the user"), I fear this fact is too easily missed.  Therefore, 
>>>>> I propose that we add an additional Note to clarify this:
>>>>>
>>>>> Note: Software configuration and storage files such as databases 
>>>>> and virus definitions, as well as computer instruction files such 
>>>>> as source code, batch/script files, and firmware, are not examples 
>>>>> of documents. Such files are not "information and sensory 
>>>>> experience to be communicated to the user" and therefore are not 
>>>>> considered content.
>>>>>
>>>>> I have added that note in context, as proposed "(New) Note 3" in 
>>>>> red text as part of the full definition of document, below:
>>>>>
>>>>>     *document (as used in WCAG2ICT)*
>>>>>
>>>>>     assembly of content
>>>>>     <http://www.w3.org/WAI/GL/wcag2ict/#keyterms_content>, such as
>>>>>     a file, set of files, or streamed media that is not part of
>>>>>     software and that does not include its own user agent
>>>>>
>>>>>     *Note 1:* A document always requires a user agent to present
>>>>>     its content to the user.
>>>>>
>>>>>     *Note 2:* Letters, spreadsheets, emails, books, pictures,
>>>>>     presentations, and movies are examples of documents.
>>>>>
>>>>>     *(New) Note 3: Software configuration and storage files such
>>>>>     as databases and virus definitions, as well as computer
>>>>>     instruction files such as source code, batch/script files, and
>>>>>     firmware, are not examples of documents.  Such files are not
>>>>>     "information and sensory experience to be communicated to the
>>>>>     user" and therefore are not considered content.*
>>>>>
>>>>>     *Note 4 (formerly Note 3):* Anything that can present its own content
>>>>>     without involving a user agent, such as a self playing book,
>>>>>     is not a document but is software.
>>>>>
>>>>>     *Note 5 (formerly Note 4):* A single document may be composed of multiple
>>>>>     files such as the video content, closed caption text, etc.
>>>>>     This fact is not usually apparent to the end-user consuming
>>>>>     the document / content. This is similar to how a single web
>>>>>     page can be composed of content from multiple URIs (e.g. the
>>>>>     page text, images, the JavaScript, a CSS file etc.).
>>>>>
>>>>>
>>>>>
>>>>> I would like to propose this edit as part of the WCAG WG review 
>>>>> next Tuesday July 9th, so it can get into the 3rd/final public 
>>>>> draft that we publish later in July.
>>>>>
>>>>> Any thoughts/edits before I do this as part of my WCAG WG 
>>>>> "Ultimate? Survey" 
>>>>> <https://www.w3.org/2002/09/wbs/35422/Ultimate/> response?
>>>>>
>>>>>
>>>>> Peter
>>>>>
>>>>> -- 
>>>>> <Mail Attachment.gif> <http://www.oracle.com/>
>>>>> Peter Korn | Accessibility Principal
>>>>> Phone: +1 650 5069522 <tel:+1%20650%205069522>
>>>>> 500 Oracle Parkway | Redwood City, CA 94064
>>>>> <Mail Attachment.gif> <http://www.oracle.com/commitment>Oracle is 
>>>>> committed to developing practices and products that help protect 
>>>>> the environment
>>>>>
>>>>
>>>
>>
>

-- 
Oracle <http://www.oracle.com>
Peter Korn | Accessibility Principal
Phone: +1 650 5069522 <tel:+1%20650%205069522>
500 Oracle Parkway | Redwood City, CA 94065
Green Oracle <http://www.oracle.com/commitment> Oracle is committed to 
developing practices and products that help protect the environment
Received on Thursday, 4 July 2013 05:57:56 UTC