- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Fri, 31 Jul 2009 17:16:26 +1000
- To: public-html <public-html@w3.org>
Hi everybody,

I sent this email to the WHATWG today, which has more details than what John forwarded the other day. I thought I should share it with this mailing list, too, to give everyone sufficient access to comment.

Best Regards,
Silvia.

---------- Forwarded message ----------
From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Fri, Jul 31, 2009 at 1:36 PM
Subject: Progress on video accessibility
To: WHAT Working Group <whatwg@lists.whatwg.org>

Hi,

Several proposals have been made on this list in the past on how to approach accessibility for the HTML5 <video> element. I think the best way in which we can progress this is by doing an implementation of a spec, discussing it, improving the spec, rinse and repeat, which IIUC is the process WHATWG is using anyway.

So, in this spirit, I would like to contribute a specification and implementation for how to attach out-of-band time-aligned text data to HTML5 <video> (and <audio>) elements. What I mean by out-of-band is that the text that is associated with the <video> is not available inside the binary video stream, but as a separate Web resource, and needs to be retrieved before it can be displayed. This is a common use case and should be supported from within HTML5 in addition to supporting in-band time-aligned text.

BTW: in-band time-aligned text for Ogg is something I want to experiment with next, since I would like us to get to an API that supports both in-band and out-of-band text in the same way.

But let me get straight to this current experiment:

- the demo is at http://www.annodex.net/~silvia/itext/
- the specification is at https://wiki.mozilla.org/Accessibility/HTML5_captions
- a description and a first set of feedback that I have gathered is at https://wiki.mozilla.org/Accessibility/Experiment1_feedback

Let me list some of the thoughts behind the proposal:

* I can see a need for a multitude of different categories of time-aligned text that either already exist or will be developed in the future.
The list that I can currently grasp is mentioned in the specification. While these text categories are rather diverse (e.g. karaoke text, ticker text, chapter markers, captions), they all share common properties and can be handled in fundamentally the same way by a browser. I therefore propose a common "itext" element (for "included text") to deal with associating such time-aligned text resources with <video> resources.

* While the demo only shows how to apply <itext> to <video>, I believe it should be possible to also associate all of them with <audio>. An implementation experiment is necessary to examine the differences, which I believe to be mostly about display mechanisms.

* I can also see a need for internationalisation of each text category. I.e. each text resource will come with an associated language for which it is valid, and alternative language resources will be made available. This is why I am suggesting the @lang attribute.

* Together, the @category and @lang attributes create a list of text tracks for the <video> for different display mechanisms. Assuming differing @lang tracks of the same @category are alternatives, while all @category tracks are allowed to appear at the same time, I developed a DVD-like menu for time-aligned text. You will find it in the demo under the "text bubble" button.

* It is unclear which of the given alternative text tracks in different languages should be displayed by default when loading an <itext> resource. A @default attribute has been added to the <itext> elements to allow the Web content author to tell the browser which <itext> tracks he/she expects to be displayed by default. If the Web author does not specify such tracks, the display depends on the user agent (UA - generally the Web browser): for accessibility reasons, there should be a field that allows users to always turn display of certain <itext> categories on.
Further, the UA is set to a default language, and it is this default language that should be used to select which <itext> track should be displayed.

* Since there is not a single file format that satisfies all categories of time-aligned text, I can see a need for <itext> to be able to link to several different text formats. The only one used in the demo is SRT. I will also be looking at LRC and DFXP. I believe ultimately we will want to state which format a browser must support as baseline, but I also believe we need to experiment with them a bit more. I am not intending to define another new format at this stage. However, I have added a @type attribute to <itext> so we can specify which file format is to be expected at the end of the @src link. This is similar to the @type attribute of the <video> element.

* Several of the current de facto standard formats of time-aligned text are rather simple (including SRT and LRC) and do not include information about the charset that they are encoded in. For that reason, a @charset attribute was added to the <itext> specification.

* Another typical feature of time-aligned text files is that they may be out of sync with the actual video file. For that purpose, a @delay attribute was suggested as an addition to the <itext> element. This has not been implemented in the demo. In the feedback to this proposal, a further "stretch" or "drift" attribute was suggested.

* The idea for the display of the text categories is that we use existing browser display capabilities to do the display. Thus, I have defined for each text category a default display mechanism, i.e. a div in the DOM into which it gets rendered and a default CSS styling for the div and the text inside it. This also enables a Web developer to make changes to the default display simply through their own CSS styling.

* The demo includes a textual audio description track, which allows visually impaired people to experience the video through use of their screenreader.
The text is rendered into a div that has the @aria-live attribute set and thus generally works. I have used it successfully on my Mac with Firefox and the firevox plugin. I have heard from others who have used JAWS and NVDA successfully with it, though with some bugs, which are being looked into.

* The demo generally works in all browsers that support the <video> tag, including Safari when XiphQT is installed.

I am curious about comments to this proposal and suggestions for improvement. I have not yet developed an improved specification, but instead have collected feedback at https://wiki.mozilla.org/Accessibility/Experiment1_feedback#Thoughts_.2F_Feedback . Feel free to comment on the feedback, too - either here on the mailing list or in the wiki.

Feedback has generally been encouraging, so I believe we are on the right track.

Regards,
Silvia.

P.S. I may not have reached everyone who should know about this proposal, so feel free to forward the email to those people and invite them to contribute. Thanks.
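
For readers who want to reason about the default-track behaviour described in the @category / @lang / @default bullets above, here is a minimal sketch in Python. It is not taken from the demo or the specification. The names ITextTrack and select_tracks, the category strings, and the file names are all hypothetical. It also assumes that the fallback to user-agent settings is applied per category, and that nothing is shown when no track matches the UA language; the email leaves both points open.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class ITextTrack:
    """Hypothetical model of one <itext> element (names are illustrative)."""
    src: str
    category: str          # mirrors @category
    lang: str              # mirrors @lang
    default: bool = False  # mirrors @default


def select_tracks(tracks: Iterable[ITextTrack],
                  ua_lang: str,
                  always_on: Set[str]) -> Dict[str, ITextTrack]:
    """Pick at most one track per category to display on load.

    Tracks of the same category in different languages are treated as
    alternatives; different categories may be shown at the same time.
    """
    # Group the alternatives by category, keeping document order.
    by_category: Dict[str, List[ITextTrack]] = {}
    for track in tracks:
        by_category.setdefault(track.category, []).append(track)

    chosen: Dict[str, ITextTrack] = {}
    for category, alternatives in by_category.items():
        # 1. A track marked as default by the author wins.
        author_default = next((t for t in alternatives if t.default), None)
        if author_default is not None:
            chosen[category] = author_default
            continue
        # 2. Assumption: otherwise show the category only if the user has
        #    switched it on, using the track in the UA's default language.
        if category in always_on:
            match = next((t for t in alternatives if t.lang == ua_lang), None)
            if match is not None:
                chosen[category] = match
    return chosen


if __name__ == "__main__":
    demo = [
        ITextTrack("captions_en.srt", "captions", "en"),
        ITextTrack("captions_de.srt", "captions", "de"),
        ITextTrack("subtitles_fr.srt", "subtitles", "fr", default=True),
        ITextTrack("subtitles_en.srt", "subtitles", "en"),
    ]
    for category, track in select_tracks(demo, "de", {"captions"}).items():
        print(category, track.src)
```

With a UA language of "de" and captions switched on by the user, this picks the German captions and the author-defaulted French subtitles. A real implementation would also need a rule for partial language matches (e.g. "de-AT" against "de"), which this sketch deliberately leaves out.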
Received on Friday, 31 July 2009 07:17:26 UTC