Re: video and long text descriptions / transcripts from Silvia Pfeiffer on 2012-04-07 (public-html-a11y@w3.org from April 2012)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Sat, 7 Apr 2012 21:52:43 +1000
To: David Singer <singer@apple.com>
Cc: HTML Accessibility Task Force <public-html-a11y@w3.org>
Message-ID: <CAHp8n2m+pkU7rUjtmAqTc40_yn5F-JxzEcAKtyBOckPV56AKNQ@mail.gmail.com>
On Sat, Apr 7, 2012 at 3:24 AM, David Singer <singer@apple.com> wrote:
> Silvia
>
> I think we have a fundamental difference of approach here.
>
> My observation: we have a number of places where all users (and their UAs) would benefit from knowing of a relationship between two elements, notably a media element and non-timed alternatives.  So my taste is to find a simple, general, mechanism that can express that, and then we just need to (incrementally) decide on the set of relationships we need to express.  My current feeling is that we need "transcript" and "long description", as they are distinct.  Maybe, down the road, the web will get flooded with opera aficionados and we'll need "plot summary" as well, for example. I think this leads to a flexible, layered, design, and permits expression of nuance (that a long description and a transcript are different, for example).


I appreciate that need, too. However, I do not think that this is
satisfying an accessibility need. It is a much broader use case for
sighted users as well as non-sighted users. Let me try and recap your
needs from my perspective and you can tell me if I understand
correctly.

Primarily what you are saying is that you want a means to associate a
list of links to text equivalents for a video with the video element.
Further you are saying that just listing the links below the video is
not a good enough mechanism of association - you want something that
is more tightly associated and machine-discoverable, as well as
discoverable by AT. Further you are saying that every link needs to
expose what relationship the linked text has with the video element.

I think the solution for your problem already exists. It comes in the
combination of @aria-describedby, hyperlinks, and microdata. Here is
an example:

<video controls aria-describedby="summary transcript script">
  <source src="_path_to_WebM_file_" type="video/webm">
  <source src="_path_to_mp4_file_" type="video/mp4">
  <track src="_path_to_WEBVTT_file_" kind="captions">
    (etc.)
</video>
<p id=summary>
This is some text summary of the video.
</p>
<a id=transcript href="_link_to_transcript_"
itemprop="transcript">Video Transcript</a>
<a id=script href="_link_to_script_" itemprop="script">Video Script</a>

In this markup a machine can discover that a linked document exists
for the video and what its relationship is (e.g. the transcript link
and the script link). Is that sufficient?
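
To make "machine-discoverable" a bit more concrete, here is a rough
sketch of how a script could resolve the IDs in @aria-describedby to
the linked documents and their relationships. The function name
findTextAlternatives and the plain objects standing in for DOM
elements are hypothetical, chosen only so the sketch runs outside a
browser; a real UA or AT would walk the actual DOM instead.

```javascript
// Hypothetical helper: resolve the IDs listed in aria-describedby to the
// linked text alternatives and the relationship each one declares.
// Elements are modelled as plain objects (an assumption for illustration).
function findTextAlternatives(video, elementsById) {
  // aria-describedby holds a whitespace-separated list of element IDs.
  const ids = (video["aria-describedby"] || "").split(/\s+/).filter(Boolean);
  return ids
    .map((id) => elementsById[id])
    // Keep only hyperlinks that carry an itemprop naming the relationship.
    .filter((el) => el && el.href && el.itemprop)
    .map((el) => ({ relationship: el.itemprop, url: el.href }));
}

// Stand-ins for the markup in the example above.
const video = { "aria-describedby": "summary transcript script" };
const elementsById = {
  summary: { tag: "p" },
  transcript: { tag: "a", href: "_link_to_transcript_", itemprop: "transcript" },
  script: { tag: "a", href: "_link_to_script_", itemprop: "script" },
};

const alternatives = findTextAlternatives(video, elementsById);
console.log(alternatives);
```

The summary paragraph is skipped here because it is inline text rather
than a linked document; only the transcript and script links are
reported.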


> Your taste, as I understand it, is to be much more specific, but unless the specific solution is actually more simple than the general one (and in this case, I don't see that) I am not sure I see what other advantages there are...


My "taste" is that this is an accessibility mailing list and I am
particularly focused on solving accessibility needs. The one
accessibility need that we haven't solved yet for videos and that we
can solve through markup is the one for deaf-blind users. I want a
simple solution for a deaf-blind user: a single link that the user
can follow to find a comprehensive alternative to the full
video. I don't want that user to have to go through a potentially
large number of links and have to look at each to find out which is
the one they should read to get their text representation.


>> Who for? Who would make use of this information? Which one would the
>> screen reader use?
>>
>
> Whichever it, or the user, likes.  Go ahead, be inventive.  With luck, the regular UAs will expose the link(s) well enough that the need for specialist accessibility UAs would decrease.

You're suggesting that we just introduce a single means of associating
URLs with a video. This would mean that the long description for
deaf-blind users would be associated in the same way as all the other
text alternatives, e.g.


<video controls aria-describedby="longdesc_video">
  <source src="_path_to_WebM_file_" type="video/webm">
  <source src="_path_to_mp4_file_" type="video/mp4">
  <track src="_path_to_WEBVTT_file_" kind="captions">
    (etc.)
</video>
<a id=longdesc_video href="_link_to_longdesc_"
itemprop="longdesc">Video Transcript</a>

We can do that. But in this case we cannot have a keyboard shortcut
such as SHIFT+ENTER to activate the longdesc. Because of this, I would
prefer having this as a direct link on the video element with
some extra functionality that is possible because it is special.
Essentially, I want the longdesc to be a shortcut for the above
markup, augmented with a visual indication and keyboard activation.

Then we can do the following type of markup:

<video controls aria-describedby="summary transcript script"
aria-describedat="_link_to_transcript_">
  <source src="_path_to_WebM_file_" type="video/webm">
  <source src="_path_to_mp4_file_" type="video/mp4">
  <track src="_path_to_WEBVTT_file_" kind="captions">
  <track src="_path_to_WEBVTT_file_" kind="descriptions">
    (etc.)
</video>
<p id=summary>
This is some text summary of the video.
</p>
<a id=transcript href="_link_to_transcript_"
itemprop="transcript">Video Transcript</a>
<a id=script href="_link_to_script_" itemprop="script">Video Script</a>

You may notice that the transcript link is used twice here. That's
because I am overloading its use to be both a long description for the
deaf-blind and an explicitly linked transcript on the page. That
could be avoided if users were aware that SHIFT+ENTER will
always give them the transcript.
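
As a rough sketch of the keyboard activation I have in mind, assuming
the proposed @aria-describedat attribute and SHIFT+ENTER as the
shortcut: the function name longDescriptionTarget and the plain
objects standing in for the key event and the video's attributes are
hypothetical. In a browser this logic would sit in a keydown listener
on the focused video element and navigate to the returned URL.

```javascript
// Hypothetical: decide where SHIFT+ENTER on a focused video should lead.
// Returns the aria-describedat URL if the shortcut was pressed and the
// attribute is present, otherwise null (no special handling).
function longDescriptionTarget(keyEvent, videoAttributes) {
  const isShortcut = keyEvent.key === "Enter" && keyEvent.shiftKey === true;
  const url = videoAttributes["aria-describedat"];
  return isShortcut && url ? url : null;
}

// Stand-in for the video element in the markup above.
const videoAttributes = {
  "aria-describedby": "summary transcript script",
  "aria-describedat": "_link_to_transcript_",
};

const withShift = longDescriptionTarget(
  { key: "Enter", shiftKey: true },
  videoAttributes
);
const withoutShift = longDescriptionTarget(
  { key: "Enter", shiftKey: false },
  videoAttributes
);
console.log(withShift, withoutShift);
```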

With such markup, all users get to "watch" the video in its
entirety: sighted users the video file, blind users the
video file + the description, deaf users the video file + the
captions, and deaf-blind users the transcript. If they want any of the
other text alternatives, they find them nearby, which is even
announced by the screen reader.

Does this work?

Cheers,
Silvia.
Received on Saturday, 7 April 2012 11:53:31 UTC