Re: Recently discovered issue with WCAG2ICT definition of "document" - suggesting a new note to clarify from Loïc Martínez Normand on 2013-07-04 (public-wcag2ict-tf@w3.org from July 2013)

From: Loïc Martínez Normand <loic@fi.upm.es>
Date: Thu, 4 Jul 2013 12:06:41 +0200
To: Gregg Vanderheiden <gv@trace.wisc.edu>
Cc: Peter Korn <peter.korn@oracle.com>, David MacDonald <david100@sympatico.ca>, "public-wcag2ict-tf@w3.org" <public-wcag2ict-tf@w3.org>, Gregg Vanderheiden <ez1testing@gmail.com>, kirsten@can-adapt.com
Message-ID: <CAJpUyzmR88r_oLtHSJFSTtm7CNQWnRe7VNVkF+CkUariob+JMA@mail.gmail.com>
Dear all,

What a discussion! I went to bed just minutes after receiving the first
email... and bang! I woke up to a very long thread.

I think that things have become overcomplicated as the discussion has
progressed, so I'm going to try to simplify.

But first I need to go back to the origin of the discussion. We have the
definitions of "content" and "document":

   - *content* (non-web content): information and sensory experience to be
   communicated to the user by means of software, including code or markup
   that defines the content’s structure, presentation, and interactions.
   - *document* (as used in WCAG2ICT): assembly of content, such as a file,
   set of files, or streamed media that is not part of software and that does
   not include its own user agent

First, let's not forget that the definition of content includes the code or
markup that defines the structure, presentation and interactions. That
means that we can have a file written in markup language that can be
considered to be a document.

Second, the important bit of the definition of document for this discussion
is that a document "is not part of software". I think that the files that
Peter has been talking about (configuration files, virus definition files,
internal databases) are, in fact, part of software and thus are not
documents.

So my proposal for the new (shorter) note is:

*(New) Note 3: Software configuration and storage files such as databases
and virus definitions, as well as computer instruction files such as source
code, batch/script files, and firmware, are examples of files that are part
of software and thus are not examples of documents.*

What do you think? I don't think that we need to add text to explain that
it is the software that "contains" these files that needs to be considered,
do we?

Best regards,
Loïc


On Thu, Jul 4, 2013 at 8:41 AM, Gregg Vanderheiden <gv@trace.wisc.edu> wrote:

> Wow -- this is getting long.
>
> I think I see another way around the problem (see below).
>
> First - what was the problem?
> - The problem comes from talking about a file that is "separate from the
> software" (such as an update file or database) that is used by the
> software and subsequently causes information not in the software to be
> displayed. Is this a "document"?
> - The concern was that if the software doesn’t know the contents of the
> file in advance, then any new non-text content of the file that gets
> presented to a user cannot be made accessible by the software at all.
> So the file needs to follow the SC and itself provide the alternate form of
> the non-text content, just like any HTML file, for example.
>
> The language below (and previous versions) did not cover this -- and said
> that the software was responsible and the file did not need to follow the
> SC.   This is a problem.
>
> HOWEVER - I think we can get where you want to be by talking about the
> virus update etc. as an UPDATE to the software rather than a separate piece
> of content or "document".
>
> Something like this:
>
>
> *Note 3: Software configuration and storage files such as databases and
> virus definitions, as well as computer instruction files such as source
> code, batch/script files, and firmware, that are part of a software
> package, or an update to part of the software package are not examples of
> documents.  As with any update, if they include new non-text information
> for presentation to users, they would be expected to include accompanying
> alternate text presentations if the software doesn’t already have them or
> the ability to create them.  But they function as, and would be evaluated
> as, part(s) of the software and not as separate entities or as documents.*
>
>
> Does that address the problem - without creating a new one?
>
>
>
> *Gregg*
> --------------------------------------------------------
> Gregg Vanderheiden Ph.D.
> Director Trace R&D Center
> Professor Industrial & Systems Engineering
> and Biomedical Engineering University of Wisconsin-Madison
> Technical Director - Cloud4all Project - http://Cloud4all.info
> Co-Director, Raising the Floor - International -
> http://Raisingthefloor.org
> and the Global Public Inclusive Infrastructure Project -  http://GPII.net
>
> On Jul 4, 2013, at 12:56 AM, Peter Korn <peter.korn@oracle.com> wrote:
>
>  Gregg, David,
>
> I think where we are getting tripped up is around the common-sense concept
> of what a document is, vs. files that could contain information that in
> some fashion gets displayed to a user, at some point, by software.
>
> I think about files used internally by some software to persist the user
> interface (see the last example paragraph in SC 4.1.1 <http://www.w3.org/WAI/GL/wcag2ict/#ensure-compat-parses>:
> "Examples of markup used internally for persistence of the software user
> interface that are never exposed to assistive technology include: XUL,
> GladeXML, and FXML. In these examples assistive technology only interacts
> with the user interface of generated software.").  These files define a
> software program's user interface - the contents of the menus and toolbars
> and dialog boxes.  But for the fact that they happen to exist as a separate
> file on disk, they are simply part of the software program as shipped, and
> we don't treat them as documents.  If instead of being encoded in
> ASCII/UNICODE, they were in binary form, nobody would be the wiser that
> these files weren't executable programs.  *We don't think of these XML UI
> definition files as "documents" for the purposes of WCAG2ICT.*  We don't
> attempt to apply all of the success criteria to them separately; *they
> are simply a part of the software program and they are covered through the
> evaluation of the software program*.  If something needed for accessibility
> is missing from them (e.g. ALT text for the icon in the toolbar), and that
> causes the software to fail a success criterion, then the software simply
> fails the SC.
>
> Similarly, a virus definition file that had embedded within it the names
> of known viruses and the names of places they appear - which may get
> displayed to the user when a virus is found - is really part of the
> anti-virus application (as periodically updated by the vendor).  If they
> were binary files that were delivered as "software patches" we wouldn't
> think of them as documents.  That they happen to have filenames encoded in
> ASCII/UNICODE should make no difference.  As with the XUL/GladeXML/FXML
> example in the paragraph above, they are simply a part of the software
> program and they are covered through the evaluation of the software
> program.  If something missing in them causes the software
> to fail a success criterion, then the software simply fails the SC.  It
> doesn't matter from which software file the failure arises.
>
> Finally, if someone were to write a program (and defined the accompanying
> database) that stored & retrieved documents, the fact that the storage
> mechanism is in a database file (or collection of files) is no different
> than if instead the "file" was a filesystem on a disk drive.  If you have
> ever run virtualization software like VirtualBox, you may notice that the
> "hard drive" that gets created for your virtual machine is in fact a file
> in the filesystem of the underlying platform.  That "hard drive" file will
> contain any number of documents (and programs and so forth).  That doesn't
> make the hard drive file *itself* a document (any more than a database
> into which someone has stored documents thereby becomes itself a
> document).  We don't apply WCAG2ICT's success criteria to the VirtualBox
> hard drive file in the underlying platform.
>
>
> So... assuming we all agree with those three paragraphs above, the
> question becomes how best to state this.
>
> Gregg - the approach you are advocating puts a constraint on the types of
> files: they avoid being called "documents" only if they "do not present
> information to users through a user agent" (this is because of where you
> have placed the comma).  But since we have redefined content from what it
> was in WCAG - to remove the term "user agent" from it - we have content
> being any "information and sensory experience to be communicated to the
> user by means of software".  So we have something that is circular.
>
>
> Maybe I can get at this another way: by making clear that where files that
> are simply part of software happen to contain "information and sensory
> experience to be communicated to the user", you don't consider those
> separate files to be documents, but instead apply WCAG2ICT to that software
> (and the content rendered by it, wherever it may have come from).  See
> the new 2nd sentence below:
>
> *Note 3: Software configuration and storage files such as databases and
> virus definitions, as well as computer instruction files such as source
> code, batch/script files, and firmware, are not examples of documents.  If
> and where software retrieves "information and sensory experience to be
> communicated to the user" from such files, those files contribute to
> content <http://www.w3.org/WAI/GL/wcag2ict/#keyterms_content> that occurs
> in software (and WCAG2ICT applies to that software <http://www.w3.org/WAI/GL/wcag2ict/#keyterms_software>).*
>
>
> David - in the case of your example of a database containing documents...
> if the document is never available separately (e.g. the software program
> that stores/retrieves/displays the document from the database is the only
> way a user can ever read & interact with the document), then I claim it
> isn't a document.  If this were a closed system (e.g. a kiosk) displaying
> canned information stored entirely inside it (not retrieved over the web),
> we would only evaluate it as software (with closed functionality).  We
> wouldn't attempt to say that the kiosk's information was contained in one or
> more documents that can be separately evaluated - that information is
> opaque to us.
>
> Now, if/when a document is retrieved from a database and emitted into a
> stand-alone form that can separately be retrieved and presented by a user
> agent (e.g. I've obtained a Word file from Microsoft Sharepoint and stored
> a snapshot of it on my local hard drive), then that *becomes* a document
> and it can be separately evaluated as such.  But the datastore maintained
> by Microsoft Sharepoint (containing any number of documents and document
> revisions, in any number of snapshots and states), isn't itself a
> document.  It is a file that is internal to the application.
>
>
> Peter
>
>
> On 7/3/2013 6:40 PM, Gregg Vanderheiden wrote:
>
>
>  On Jul 3, 2013, at 8:22 PM, Peter Korn <peter.korn@oracle.com> wrote:
>
>  Gregg,
>
> Your suggestion leads to circular reasoning.
>
> The problem with this route is then any time we have some information in
> some file somewhere, and that information is the source in some fashion of
> "content", the software that presents it becomes a user agent?  And the
> file becomes a document?
>
>
>  If the information is displayed to users -- it IS content.  And if the
> database contains the text and images to display -- then it HAS to contain
> the alternate text for the images. (The app displaying the data can't add
> alt text itself - it doesn’t know what the data is until display time.)
>
>  So this is exactly what we WANT it to say.
>
>
> So if my virus definition file contains the names of viruses, and those
> names are displayed in my anti-virus program, the anti-virus program is now
> a user agent?  And the virus definition file is now a document?
>
>
>  Absolutely.   And if the virus definition files used icons instead of
> text to 'name' the viruses - the virus definition file would have to have
> alt text for those icons.
>
>  And if there is any other non-text information to be displayed to the
> user-- the virus definition file would need to have the text alternative so
> the application could provide that text as well in a programmatically
> determinable way.
>
>
>
> That makes no sense.
>
>
>  Make sense now?
>
>
>  G
>
> Peter
>
> On 7/3/2013 6:18 PM, Gregg Vanderheiden wrote:
>
> How about, instead of "raw", we pick up on the key distinction?
>
>   *(New) Note 3: Software configuration and storage files such as
> databases and virus definitions, as well as computer instruction files such
> as source code, batch/script files, and firmware, that do not present
> information to users through a user agent are not examples of documents.
> Such files are not "information and sensory experience to be communicated
> to the user" and therefore are not considered content*
>
>
>  If a database IS just data that a user agent displays - then it WOULD be
> covered.  One could argue that an HTML file is source code for the page
> rendering.  Certainly the JavaScript is.
>
>
>     *Gregg*
>
>  On Jul 3, 2013, at 7:50 PM, Peter Korn <peter.korn@oracle.com> wrote:
>
>  David,
>
> What makes a file "raw"?  I view the situation of a program retrieving
> data from somewhere and presenting it within its user interface as
> "content" that is displayed in software.  Said content must be accessible.
> Said content could come from a database file.  Said content could be a
> persisted user interface (cf. SC 4.1.1 <http://www.w3.org/WAI/GL/wcag2ict/#ensure-compat-parses>).
> And just like the 4.1.1 case (addressing your PS in the following e-mail),
> there could be information in that file that helps with accessibility (e.g.
> the database contains images and also ALT text for those images).
>
> But we aren't losing anything here - whatever is in the database that
> winds up being presented in a user interface is content that must be
> accessible.  If it isn't accessible when presented in software, WCAG2ICT
> catches it.
>
> But it doesn't make sense to try to apply all of WCAG to a database file
> as if it was a web page or a word processing file.  That's the point here.
>
>
> Peter
>
> On 7/3/2013 5:43 PM, David MacDonald wrote:
>
> Just one nit...
>
> Can we add the word “raw” or some other word to make it clearer...
>
> *... raw storage files such as databases*
>
> I’m a little nervous it might make the pendulum swing the other way, and
> some administrators might think it’s not a document if a user agent is
> serving up content from a database on the backend...
>
> Cheers
>
> David MacDonald
>
> *CanAdapt Solutions Inc.*
>
> *  Adapting the web to all users*
>
> *            Including those with disabilities*
>
> www.Can-Adapt.com <http://www.can-adapt.com/>
>
>
> *From:* Peter Korn [mailto:peter.korn@oracle.com <peter.korn@oracle.com>]
> *Sent:* July-03-13 6:59 PM
> *To:* public-wcag2ict-tf@w3.org Force
> *Subject:* Recently discovered issue with WCAG2ICT definition of
> "document" - suggesting a new note to clarify
>
>
> Hi gang,
>
> As part of a wider review of WCAG2ICT (asking colleagues who aren't on the
> Task Force to look at it), I just discovered an issue with the definition
> of "document <http://www.w3.org/WAI/GL/wcag2ict/#keyterms_document>".
> The issue is that readers will see the term "document" and think "file",
> and therefore try to apply WCAG requirements to all manner of files (virus
> definition files and programming files were two specific concerns that came
> up from colleagues).
>
> While our definition of "document" is based on the term "content" <http://www.w3.org/WAI/GL/wcag2ict/#keyterms_content>
> (which is scoped to "information and sensory experience to be communicated
> to the user"), I fear this fact is too easily missed.  Therefore, I propose
> that we add an additional Note to clarify this:
>
> Note: Software configuration and storage files such as databases and virus
> definitions, as well as computer instruction files such as source code,
> batch/script files, and firmware, are not examples of documents.  Such
> files are not "information and sensory experience to be communicated to the
> user" and therefore are not considered content.
>
> I have added that note in context, as proposed "(New) Note 3" in red text,
> as part of the full definition of document, below:
>
> *document (as used in WCAG2ICT)*
>
> assembly of content <http://www.w3.org/WAI/GL/wcag2ict/#keyterms_content>,
> such as a file, set of files, or streamed media that is not part of
> software and that does not include its own user agent
>
> *Note 1:* A document always requires a user agent to present its
> content to the user.
>
> *Note 2:* Letters, spreadsheets, emails, books, pictures,
> presentations, and movies are examples of documents.
>
> *(New) Note 3: Software configuration and storage files such as databases
> and virus definitions, as well as computer instruction files such as source
> code, batch/script files, and firmware, are not examples of documents.
> Such files are not "information and sensory experience to be communicated
> to the user" and therefore are not considered content.*
>
> *Note 4 (formerly Note 3):* Anything that can present its own content
> without involving a user agent, such as a self-playing book, is not a
> document but is software.
>
> *Note 5 (formerly Note 4):* A single document may be composed of multiple
> files such as the video content, closed caption text, etc. This fact is not
> usually apparent to the end-user consuming the document / content. This is
> similar to how a single web page can be composed of content from multiple
> URIs (e.g. the page text, images, the JavaScript, a CSS file etc.).
>
>
>
> I would like to propose this edit as part of the WCAG WG review next
> Tuesday July 9th, so it can get into the 3rd/final public draft that we
> publish later in July.
>
> Any thoughts/edits before I do this as part of my WCAG WG "Ultimate?
> Survey" <https://www.w3.org/2002/09/wbs/35422/Ultimate/> response?
>
>
> Peter
>
> --
> Peter Korn | Accessibility Principal
> Phone: +1 650 5069522
> 500 Oracle Parkway | Redwood City, CA 94064
> Oracle is committed to developing practices and products that help protect
> the environment <http://www.oracle.com/commitment>
>


-- 
---------------------------------------------------------------
Loïc Martínez-Normand
DLSIIS. Facultad de Informática
Universidad Politécnica de Madrid
Campus de Montegancedo
28660 Boadilla del Monte
Madrid
---------------------------------------------------------------
e-mail: loic@fi.upm.es
tfno: +34 91 336 74 11
---------------------------------------------------------------
Received on Thursday, 4 July 2013 10:07:11 UTC