Re: video and long text descriptions / transcripts

Quoting David Singer <singer@apple.com>:

> I think we have a fundamental difference of approach here.

+1

>
> My observation: we have a number of places where all users (and  
> their UAs) would benefit from knowing of a relationship between two  
> elements, notably a media element and non-timed alternatives.  So my  
> taste is to find a simple, general, mechanism that can express that,  
> and then we just need to (incrementally) decide on the set of  
> relationships we need to express.  My current feeling is that we  
> need "transcript" and "long description", as they are distinct.

I have to agree here as well.

Silvia, as I have previously indicated, there is a distinct function  
of the Accessible.Description role which serves a specific need with  
regard to the Accessibility APIs. You keep insisting that the full  
transcript is the equivalent of this long description, when in fact,  
in the realm of accommodations to people with disabilities, the  
transcript is actually an "Alt Format" accommodation (for example, see  
here: http://www.altformat.org/)

Other examples of "Alt Format" production include "talking books"  
(whether in the Daisy format, or a direct recording available via CD,  
cassette, etc.), or a tactile map (often created with special,  
heat-reacting ink that produces a 3-dimensional rendering of a  
standard 2-D map, image or other graphic -  
http://blindreaders.info/mapgraph.html), etc.

Alt Format(s) are complete, functional replacements for the  
traditional or 'mainstream' material. They are not "descriptions" of  
the media they are replacing, they are the actual replacements. To our  
end users, there is a huge difference here.


> Maybe, down the road, the web will get flooded with opera  
> aficionados and we'll need "plot summary" as well, for example. I  
> think this leads to a flexible, layered, design, and permits  
> expression of nuance (that a long description and a transcript are  
> different, for example).

I agree. Silvia, earlier you asked where else we have multiple (text)  
files associated with an asset, from which the end user gets to  
choose. The answer is in <video> itself: <track> takes the @kind  
attribute so that we can differentiate between the French subtitles,  
the Italian subtitles, and the English closed captions. While these  
files are time-stamped for synchronization with the video, they remain  
"textual" in nature, in that they paint text onto the video region,  
and one of the stated goals of WebVTT is that these files can be  
styled (using CSS) the same as any other text on screen (presumably  
including the ability to use @font-face... ugh). The point is, for the  
end user it will be text on screen (as opposed to bit-mapped images).  
As well, we already have examples of descriptive text files being  
voiced via speech synthesizers, which are processing 'standard' text  
files, albeit now synched to an external time-line.
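To make the existing mechanism concrete, here is a minimal sketch of the markup in question (file names and labels are placeholders of my own, not real resources):

```html
<!-- Current HTML: @kind and @srclang on <track> distinguish the
     French and Italian subtitles from the English closed captions;
     src/label values here are hypothetical. -->
<video src="opera.webm" controls>
  <track kind="subtitles" srclang="fr" src="subs-fr.vtt" label="Français">
  <track kind="subtitles" srclang="it" src="subs-it.vtt" label="Italiano">
  <track kind="captions"  srclang="en" src="captions-en.vtt" label="English CC">
</video>
```

The browser's built-in track menu then offers all of these to every user, which is the "standard browser control" discoverability argued for below.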

I had previously inquired about extending the <track> element (and/or  
more specifically the @kind attribute) to address the needs that are  
now surfacing, and so I will re-pose that question: outside of the  
fact that some files would be time-stamped, and others not, why  
*can't* we use <track> + @kind to provide the type of linkage we are  
seeking? It would seem to me that extending the contextual menu that  
is being used to offer the choice of French or Italian sub-titles  
could also easily offer a text transcript or Opera libretto [sic]. Why  
not kind="transcript" or kind="libretto"? (We would also have the  
advantage that any and all of these alternative textual files would be  
discoverable and selectable by *all* users, using a standard browser  
control).
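To sketch what that proposed extension might look like (note: kind="transcript" and kind="libretto" are NOT valid @kind values in the current spec; this is the suggestion being posed, with hypothetical file names):

```html
<!-- Proposed, not current, HTML: untimed textual alternatives exposed
     through the same track-selection menu as subtitles/captions. -->
<video src="opera.webm" controls>
  <track kind="subtitles"  srclang="fr" src="subs-fr.vtt" label="Français">
  <track kind="subtitles"  srclang="it" src="subs-it.vtt" label="Italiano">
  <track kind="transcript" srclang="en" src="transcript-en.html" label="Transcript">
  <track kind="libretto"   srclang="it" src="libretto-it.html" label="Libretto">
</video>
```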


>>
>> Who for? Who would make use of this information? Which one would the
>> screen reader use?
>>
>
> Whichever it, or the user, likes.

+1  I've long argued choice is good; why restrict the end user to only  
one choice?


JF

Received on Sunday, 8 April 2012 00:03:35 UTC