- From: Robert J Burns <rob@robburns.com>
- Date: Fri, 12 Sep 2008 21:17:35 +0200
- To: Dave Singer <singer@apple.com>
- Cc: HTML WG <public-html@w3.org>, W3C WAI-XTECH <wai-xtech@w3.org>
Hi Dave,
On Sep 4, 2008, at 12:13 AM, Dave Singer wrote:
> NOTE: Please be careful with replies here. Because the subject
> alas touches on accessibility, HTML, and CSS I have included all
> those groups (I hope), and also BCC'd WhatWG. If you're in WhatWG,
> please note that the discussion here started on public-html and so I
> am encouraging it to stay there.
>
> We've actually been thinking about the framework for accessibility
> of media elements in HTML5. Note that this is rather different from
> discussing (say) caption formats or the like. I've attached a
> 'thought piece' on the subject, which attempts to lay out some of
> the needs as we see them, and also proposes a way ahead.
>
> Comments gratefully received; this is an important subject, yet
> subtle. Good accessibility is quite tricky. If the spec doesn't
> provide the right framework, or it's unworkable from the point of
> view of authors or users, you fail, no matter how good your
> intentions...
Thanks for introducing this discussion. You've obviously put much
thought into the issues and the WG owes you a debt for doing so. I
agree with much of what you wrote, so here I'm only focussing on minor
points of contention and contributing my own thoughts to shape the
discussion.
First, in addition to using an expanded conception of media queries to
shape the selection of resources, I think we should also encourage
interactive UAs to provide a mechanism for the user to override those
selections. In other words the UA should translate the media queries,
codec information, content type data and the title attribute for the
source element into localized descriptions allowing the user to
override the default selection. In this way, if for example the audio
description is poorly done and a distraction getting in the way, the
user can switch to the non-audio-description resource. Likewise for
language subtitles, the user might find the need to change the
selection away from the default after the fact.
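To make the override idea concrete, here is a minimal sketch of how a UA might build a label for each candidate resource and let an explicit user choice win over the default. The SourceCandidate shape and both function names are hypothetical, invented for illustration; nothing here comes from the HTML5 draft, and real localization of the labels is left out.

```typescript
// Hypothetical record of what a UA knows about one <source> candidate.
interface SourceCandidate {
  src: string;
  media?: string; // media query text, as authored
  type?: string;  // content type, possibly with codec information
  title?: string; // author-supplied title attribute
}

// Build a user-facing label from whatever metadata is present,
// falling back to the bare URL when nothing else is available.
function describeSource(c: SourceCandidate): string {
  const parts: string[] = [];
  if (c.title) parts.push(c.title);
  if (c.type) parts.push(c.type);
  if (c.media) parts.push(`for ${c.media}`);
  return parts.length > 0 ? parts.join(", ") : c.src;
}

// Pick the resource to play: a valid user override beats the default
// that the UA computed from media queries and codec support.
function chooseSource(
  candidates: SourceCandidate[],
  defaultIndex: number,
  userOverride?: number
): SourceCandidate {
  const valid = (i: number): boolean =>
    Number.isInteger(i) && i >= 0 && i < candidates.length;
  const index =
    userOverride !== undefined && valid(userOverride)
      ? userOverride
      : defaultIndex;
  const picked = candidates[index];
  if (!picked) {
    throw new Error("no source candidate at index " + index);
  }
  return picked;
}
```

A UA menu could list describeSource() for every candidate and pass the chosen position back as userOverride, so switching away from a poorly done audio description is a single selection.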
Second, I think the alt attribute should be unnecessary for these
elements. The alt attribute is necessary for the IMG element only
because it needs to be a void element for the text/html serialization.
Otherwise the contents of the element serve as a much better container
for the alt text replacement. As far as I can tell, no one has
presented any reasons for not using the video and audio elements'
contents in this way (for last resort fallback when the other
accessibility/universality features of the resources themselves fail).
Also, several examples from Henri and others have demonstrated that
not using the element contents for alt text encourages the detrimental
use of the element's contents for taunting fallback like: "why don't
you get a real browser that supports HTML5". We don't want to
encourage this type of authoring.
Third, for long descriptions, transcripts (with action / stage
direction) and a priori scripts (also with action / stage direction),
the longdesc attribute might prove useful. This allows authors to
reference these highly specific text equivalents in a semantically
well-defined location keeping alt text equivalents separate from these
typically more verbose text equivalents. It may be beneficial to add a
new attribute (or attributes) to distinguish these from the longdesc
attribute. On HTML4All, Philip suggested adding a new attribute with a
new value syntax such as description='URI(mediadescriptions/
description.html)' or description='A lion roaring'. We could even
introduce such a syntax for longdesc as an example of 'paving the
cowpaths'. Other attributes could be 'transcript', 'script', etc.
Alternately, these attributes could be added as child elements of the
video and audio elements or even referenced from separate 'source'
elements in the ordered list of source elements (especially if we
recommend UAs provide UI for user override selection among source
elements).
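As a rough illustration of the two-form value syntax mentioned above, the sketch below sorts a description value into either a URI reference or inline text. It assumes the URI(...) wrapper is matched exactly as written in the examples, and that whitespace inside the parentheses (such as a wrapped line) can be dropped; both are my assumptions, as are the type and function names.

```typescript
// Hypothetical parsed form of a description attribute value.
type Description =
  | { kind: "uri"; uri: string }
  | { kind: "text"; text: string };

// Treat "URI(...)" as a reference to an external description and
// anything else as the description text itself.
function parseDescription(value: string): Description {
  const trimmed = value.trim();
  const match = /^URI\(([\s\S]*)\)$/.exec(trimmed);
  if (match) {
    // Assumption: whitespace inside the parentheses is not significant.
    const uri = (match[1] ?? "").replace(/\s+/g, "");
    return { kind: "uri", uri };
  }
  return { kind: "text", text: trimmed };
}
```

With this, 'URI(mediadescriptions/description.html)' yields a URI reference and 'A lion roaring' yields inline text, so one attribute can carry either form.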
Finally, your discussion introduction and some of the other comments
made recently on this topic raise the question for me whether HTML5
should strive to include more flexible authoring of video and audio
content that does not rely exclusively on the capabilities of the
various container formats for these alternate tracks. In other words
the various video tracks, audio tracks, subtitle tracks, caption
tracks, etc may be all handled as a single file from the HTML
document's perspective. In that case the HTML5 specification does not
really need to concern itself with defining much else. However we
could take the extra steps to facilitate more distributed and
decentralized authoring of content by allowing each video, audio or
source element to also reference separate tracks for client-side
muxing. In this way, all of the source elements sharing the same time
indexing might be included alongside SMIL-referenced audio
description, subtitle and captioning files potentially located even
on separate servers (perhaps with Access Control policing this). This
use case would not replace the use of server-side delivery or pre-
muxed container formats delivered over RTSP or like protocols, but
would provide another flexible mechanism for distributed authoring of
content. For example, consider a site providing video telecasts of
events in the US. Now imagine another vendor in Minsk adds value to
the US video telecasts by simply adding Belarusian subtitles, captions
and audio description through time-based resources located on their
own servers. Now the product can be streamed and viewed in Belarus in
a decentralized manner. Multiple resources from servers on opposite
sides of the Earth are combined client-side into a single stream for
local consumption. Such a mechanism would also address the use case
raised earlier on the list for a decentralized wiki style multimedia
enhancement and localization.
I'm thinking of perhaps something like this:
<video>
  <source media='<a media query>' >
    <track src='avideofile' >
    <track src='anaudiofile' >
    <track src='acaptionfile' languages='<language metadata>' >
    <track src='asubtitlefile' languages='<language metadata>' >
    <track src='anothersubtitlefile' languages='<language metadata>' >
    ...
  </source>
  ...
  <source src='afile2' media='<a media query>' ></source>
  <source src='afile3' media='<a media query>' ></source>
  ...
</video>
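To show what combining such tracks client-side could amount to for text tracks, here is a minimal sketch that takes cue lists fetched from different servers and returns the text active at a given moment on the shared timeline. The Cue and Track shapes, the single language field, and the activeCues function are hypothetical; real muxing of audio and video is well beyond this.

```typescript
// Hypothetical timed text cue on the shared timeline, in seconds.
interface Cue {
  start: number;
  end: number;
  text: string;
}

// Hypothetical track, possibly hosted on a different server than the video.
interface Track {
  src: string;
  kind: "caption" | "subtitle" | "description";
  language: string;
  cues: Cue[];
}

// Collect the cues active at `time` across all tracks in one language,
// ordered by start time, as a UA might when combining tracks locally.
function activeCues(tracks: Track[], language: string, time: number): string[] {
  const hits: Cue[] = [];
  for (const track of tracks) {
    if (track.language !== language) continue;
    for (const cue of track.cues) {
      // A cue is active from its start up to, but not including, its end.
      if (cue.start <= time && time < cue.end) hits.push(cue);
    }
  }
  hits.sort((a, b) => a.start - b.start);
  return hits.map((cue) => cue.text);
}
```

Because every track is indexed against the same timeline, the subtitle vendor never needs to touch the original video file; the combination happens entirely on the client.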
Alternatively, we could add SMIL as yet another extension format
supported within HTML5 or referenced for embedding by the src
attribute (or finally bite the bullet and add an IE8 / 'XML
namespaces' compatible namespace mechanism to HTML5).
Take care,
Rob
[1]: <http://lists.w3.org/Archives/Public/public-html/2008Sep/0118.html>
[2]: <http://lists.w3.org/Archives/Public/public-html/2008Sep/att-0118/html5-media-accedssibility.html>
Received on Friday, 12 September 2008 19:18:21 UTC