Re: Video Poster image (was RE: DRAFT analysis of fallback mechanisms for embedded content ACTION-66)

On Dec 3, 2010, at 1:35 PM, John Foliot wrote:


> 
>> What is needed is a summary of the video that equally allows non-
>> sighted users to decide if they want to play it, just as the poster
>> frame (whether explicit or built-in) does for sighted users. 
> 
> This presumes that the poster frame will always be chosen to elicit that
> call-to-action. I am trying to explain that this may not always be the
> case - that the image chosen by any given author could serve an
> alternative purpose (whether branding, informational, or other) that is
> conceptually unrelated to a specific video, but meets other author
> needs/goals.
> 
> I agree that the video should have a summary, and even leave open the door
> that it could be explicit (@summary) or 'relative' (aria-describedby) -
> where here the Summary would appear as text on the page for both sighted
> and non-sighted users.
> 
> However that summary does not serve as the @alt value for the image being
> used - it can't, as then you are mixing oranges and apples. My video is
> not about "Stanford University - this video is closed captioned" it is
> about (whatever it is about). I have no disagreement that the author
> *could* add this information into a summary, but I must also concede that
> they might not, or that the text example I am using here is an imperfect
> example.

If there is a summary mechanism, it is already sufficient to convey information about the poster frame as well. So this leaves authors with two options:

A) Have a single summary for the video, which includes relevant information from the poster frame.
B) Have a summary for the video, plus a separate "poster alt" which only describes/replaces the poster frame.

Can you explain to me how (B) will lead to a better user experience (using any assistive technology)? 
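For concreteness, option (A) is already expressible with existing markup. The following is only an illustrative sketch; the file names, ids, and summary text are invented, and `aria-describedby` is used here as one possible way to associate a visible summary with the video:

```html
<!-- Option (A): a single visible summary covers both the video
     and whatever the poster frame conveys. All values illustrative. -->
<p id="vid-summary">
  Lecture on web accessibility (Stanford University; closed captioned).
  The opening frame shows the university seal.
</p>
<video src="lecture.webm" poster="seal.jpg" controls
       aria-describedby="vid-summary">
</video>
```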


> 
> Maciej, I have tussled with this a fair bit and have even done my own
> sanity check to ensure that I am not misguided here, and the overwhelming
> consensus I get, from both accessibility specialists as well as blind
> users themselves, is that this is not off-track: we are dealing with 2
> discrete assets - a video and an image - and they both require the ability
> to have textual fallbacks. They may be conceptually closely related, but
> they can equally be conceptually unrelated, and that is the overarching
> use-case, when they are conceptually unrelated.

Thinking of the two assets as discrete is a conceptual error. They are multiple media resources that combine to present what is, conceptually, a single item. At the user level there is only one object; the second media resource is an implementation detail.

Similarly, we are proposing accessibility mechanisms whereby a video can incorporate additional external resources as accessibility affordances. Would you argue that the <track> element needs alt, so that you can have alt text for a timed text track? Should there be separate alt text for the sign language version of a resource?

I don't think that makes sense. You have to think about what is actually getting presented in user terms, not get caught up on how many separate files are used to create that experience.
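To make the <track> comparison concrete: a timed text track already identifies itself through its own attributes rather than through a separate alt mechanism. The attribute names below follow the HTML draft of the time; the file names are invented:

```html
<!-- A <track> describes itself via kind/srclang/label;
     nobody proposes a separate alt text for the caption file. -->
<video src="lecture.webm" poster="seal.jpg" controls>
  <track kind="captions" src="lecture.en.vtt"
         srclang="en" label="English captions">
</video>
```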

> The Change Proposal I am working on now approaches the issue from the fact
> that we have 2 assets, a video and an image (<poster>), and that textual
> fallback for either should exist independently of the other. It builds on
> an existing pattern (the <video> element contains child elements, <source>,
> <track>, and so <poster>), and it also has an eye towards how to 'teach'
> authors about this - treating both assets as the discrete assets they are
> is an easy concept that most seem to grasp, at least when I discuss this
> with mainstream authors around here. It might feel like an overly
> literal-minded approach to engineers such as yourself, but when trying to
> teach non-professionals, literal is not always a bad thing.

I am thinking about this from a Human Interface point of view, not an engineering one. From the HI perspective, the user sees and interacts with one object on the screen; the fact that it is composed of multiple files is an implementation detail. Applying the rule that every visual resource must have a textual equivalent to that implementation detail is exactly the literal-minded engineering approach we should avoid. We should instead take an HI approach and think in user-level concepts. It is bad HI to present a video and its poster image as if they were separate things, whether in the mainstream UI or through assistive technologies; they should consistently be presented as a single object.

Note also: asking users what features they want is a notoriously flawed approach to HI design.

Regards,
Maciej

Received on Friday, 3 December 2010 22:01:56 UTC