Re: video and long text descriptions / transcripts

On Sun, Apr 8, 2012 at 10:03 AM, John Foliot <john@foliot.ca> wrote:
> Silvia, as I have previously indicated, there is a distinct function of the
> Accessible.Description role which serves a specific need with regard to the
> Accessibility APIs. You keep insisting that the full transcript is the
> equivalent of this long description, when in fact, in the realm of
> accommodations to people with disabilities, the transcript is actually an
> "Alt Format" accommodation (for example, see here:
> http://www.altformat.org/)

Interesting. I would have thought that the idea of the long
description link is indeed to provide a link to such an "alt format".
How else is a blind person to understand what's in an image? Or a
deaf-blind user what's in a video? Is this really not what we want
from the long description?


> Other examples of "Alt Format" production include "talking books" (whether
> in the Daisy format, or a direct recording available via CS, cassette,
> etc.), or a tactile map (often created with special, heat-reacting ink that
> produces a 3-dimensional rendering of a standard 2-D map, image or other
> graphic - http://blindreaders.info/mapgraph.html), etc.

I would have thought we would want to integrate such "alt formats"
into the Web, and that this is the whole idea of aria-describedby.
What use is it to a Web user not to have the "alt format" at their
fingertips when browsing the Web?
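
To illustrate what I mean, a hypothetical markup sketch (the file
names and ids here are made up for illustration):

```html
<!-- Sketch: exposing an "alt format" transcript directly in the
     page via aria-describedby. File name and ids are illustrative
     only, not from any real page. -->
<video id="opera" controls src="opera.webm"
       aria-describedby="opera-transcript-link"></video>
<p id="opera-transcript-link">
  <a href="opera-transcript.html">Full transcript of this video</a>
</p>
```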


> Alt Format(s) are complete, functional replacements for the traditional or
> 'mainstream' material. They are not "descriptions" of the media they are
> replacing, they are the actual replacements. To our end users, there is a
> huge difference here.

OK. This is shattering everything I've come to expect from
accessibility of Web applications: why would we not want to offer an
accessibility user an "alt format" directly on the Web? Why do we have
to force them to find functional replacements outside the Web?


>> Maybe, down the road, the web will get flooded with opera aficionados and
>> we'll need "plot summary" as well, for example. I think this leads to a
>> flexible, layered design, and permits expression of nuance (that a long
>> description and a transcript are different, for example).
>
>
> I agree. Silvia, earlier you asked where else do we have multiple (text)
> files associated to an asset, that the end user gets to choose from?  The
> answer is in <video> itself: <track> takes the @kind attribute so that we
> can differentiate between the French sub-title and the Italian sub-title,
> and the English closed caption. While these files are time-stamped to be
> synchronized with the video, they remain "textual" in nature, in that they
> paint text onto the video region, and one of the stated goals of WebVTT is
> that these files can be styled (using CSS) the same as any other text on
> screen (presumably including the ability to use @font-face... ugh) - the
> point is, for the end user it will be text on screen (as opposed to bit-map
> images); as well, we already have examples of descriptive text files being
> voiced via speech synthesizers, which are processing 'standard' text files,
> albeit now synched with an external time-line.

The <track> element is an exception in this respect because we're
allowing internationalization to kick in as a selector for a set
number of time-aligned document types. This is a very different use
case.
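
That selector pattern looks roughly like this (a sketch; the file
names are made up):

```html
<!-- Sketch of the <track> pattern under discussion: @kind and
     @srclang let the browser offer a menu of time-aligned text
     alternatives. File names are illustrative only. -->
<video controls src="film.webm">
  <track kind="subtitles" srclang="fr" label="Français"
         src="film.fr.vtt">
  <track kind="subtitles" srclang="it" label="Italiano"
         src="film.it.vtt">
  <track kind="captions" srclang="en" label="English CC"
         src="film.en-cc.vtt">
</video>
```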

My question was focused on where we are asking a blind user to select
between different alternative text representations that would equally
serve as a long description. Can you give me an example that is not
the (very special-cased) <track> element?


> I had previously inquired about extending the <track> element (and/or more
> specifically the @kind attribute) to address the needs that are now
> surfacing, and so I will re-pose that question: outside of the fact that
> some files would be time-stamped, and others not, why *can't* we use <track>
> + @kind to provide the type of linkage we are seeking?

For time-aligned documents you can use <track> with @kind=metadata.
For non-time-aligned documents <track> makes no sense - just like
publishing a table in a <canvas>, an image in a <video>, or a list
of items in a <textbox> makes no sense: none of the associated
functionality matches the use case.
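
For the time-aligned case, such a document can already be attached
today (a sketch; the file name is made up):

```html
<!-- Sketch: attaching a time-aligned description document with
     kind=metadata. A metadata track is exposed to scripts rather
     than rendered by the browser. File name is illustrative. -->
<video controls src="film.webm">
  <track kind="metadata" srclang="en" src="film.annotations.vtt">
</video>
```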


> It would seem to me
> that extending the contextual menu that is being used to offer the choice of
> French or Italian sub-titles could also easily offer a text transcript or
> Opera libretto [sic]. Why not kind="transcript" or kind="libretto"? (We
> would also have the advantage that any and all of these alternative textual
> files would be discoverable and selectable by *all* users, using a standard
> browser control).

An unlimited list of different @kind values is not reasonable, in
particular if every different @kind requires the browsers to implement
special rendering techniques.


>>> Who for? Who would make use of this information? Which one would the
>>> screen reader use?
>>>
>>
>> Whichever it, or the user, likes.
>
>
> +1  I've long argued choice is good; why restrict the end user to only one
> choice?

"User choice" would be satisfied by providing a list of links under
the video. It does not require direct inclusion in the element. Such
an inclusion only makes sense if the interaction with that included
element/attribute is special, i.e. results in extra functionality
exposed by the browser, such as keyboard shortcuts, rendering as part
of the video, or exclusive exposure to the a11y API. None of that is
the case here: we are looking to add a URL that will replace (or at
least overlay, and therefore substantially replace) the current Web
page. Such content is much more loosely associated with the video
than captions, subtitles or descriptions, which cannot be rendered
without the video itself.
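
In other words, something as plain as the following already gives the
user that choice (a sketch; the URLs are made up):

```html
<!-- Sketch: offering textual alternatives as ordinary links under
     the video rather than baking them into the element. URLs are
     illustrative only. -->
<video controls src="opera.webm"></video>
<ul>
  <li><a href="transcript.html">Full transcript</a></li>
  <li><a href="libretto.html">Libretto</a></li>
  <li><a href="summary.html">Plot summary</a></li>
</ul>
```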

Regards,
Silvia.

Received on Monday, 9 April 2012 13:45:14 UTC