- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Thu, 26 Nov 2009 00:29:37 +1100
- To: Philip Jägenstedt <philipj@opera.com>
- Cc: HTML Accessibility Task Force <public-html-a11y@w3.org>
Hi Philip, all,

See comments below inline.

On Wed, Nov 25, 2009 at 11:24 PM, Philip Jägenstedt <philipj@opera.com> wrote:
>
> I agree that syncing separate video and audio files is a big challenge. I'd
> prefer leaving this kind of complexity either to scripting or an external
> manifest like SMIL.

We have to at minimum deal with multi-track video and audio files inside
HTML, since they can potentially expose accessibility data: audio
descriptions (read by a human), sign language (signed by a person), and
captions are the particular tracks I am concerned about.

There is also always the need for different recording angles, but let's
leave that to JavaScript, where the whole media resource is exchanged.
Similarly, when we deal with different devices, we can also exchange the
complete media resource markup.

So, focusing on a video with a + v + audio description + sign language
track + caption track, we still need to expose these tracks to the Web
browser so it can decide, based on user preference settings, whether to
display them or not. This is on top of and beyond the <itext> proposals I
have previously discussed.

The Google accessibility experts wanted at least the in-line caption tracks
exposed in declarative language. This is because otherwise you cannot build
a menu of all available tracks without having to start downloading and
decoding the file.

With this in mind, I think we have to expose all of the tracks available in
a file in declarative language.

> Below I focus on the HTML-specific parts:
>
> Captions/subtitles... The main problem of reusing <source> is that it
> doesn't work with the resource selection algorithm.[1]

Yes, I have noticed that problem, too. The resource selection algorithm
regards all of the <source> elements as alternatives to each other.

> However, that
> algorithm only considers direct children of the media element, so adding a
> wrapping element would solve this problem and allow us to spec different
> rules for selecting timed-text sources.
> Example:
>
> <video>
>   <source src="video.ogg" type="video/ogg">
>   <source src="video.mp4" type="video/mp4">
>   <overlay>
>     <source src="en.srt" lang="en-US">
>     <source src="hans.srt" lang="zh-CN">
>   </overlay>
> </video>

Yes, this works for external additional tracks. Maybe then we can add the
internal tracks inside the source elements, something like this:

<video>
  <source src="video.ogg" type="video/ogg">
    <track id='v' role='video' ref='serialno:1505760010'>
    <track id='a' role='audio' lang='en' ref='serialno:0821695999'>
    <track id='ad' role='auddesc' lang='en' ref='serialno:1421614520'>
    <track id='s' role='sign' lang='ase' ref='serialno:1413244634'>
    <track id='cc' role='caption' lang='en' ref='serialno:1421849818'>
  </source>
  <source src="video.mp4" type="video/mp4">
    <track id='v' role='video' ref='trackid:1'>
    <track id='a' role='audio' lang='en' ref='trackid:2'>
  </source>
  <overlay>
    <source src="en.srt" lang="en-US">
    <source src="hans.srt" lang="zh-CN">
  </overlay>
</video>

Note I have made the track reference explicit through introducing a new
"ref" attribute which uses encapsulation format specific references to
track identifiers.

> We could possibly allow <overlay src="english.srt"></overlay> as a shorthand
> when there is only one captions file, just like the video
> <video src=""></video> shorthand.
>
> I'm suggesting <overlay> instead of e.g. <itext> because I have some special
> behavior in mind: when no (usable) source is found in <overlay>, the content
> of the element should be displayed overlayed on top of the video element as
> if it were inside a CSS box of the same size as the video. This gives
> authors a simple way to display overlay content such as custom controls and
> complex "subtitles" like animated karaoke to work the same both in normal
> rendering and in fullscreen mode. (I don't know what kind of CSS spec magic
> would be needed to allow such rendering, but I don't believe overlaying the
> content is very difficult implementation-wise.)
>
> Naturally, CSS is used to style the captions:
>
> <video src="video.ogg">
>   <overlay src="en.srt"
>     style="font-size:2em;padding:1em;text-align:center"></overlay>
> </video>
>
> If there is a use case, displaying several captions/subtitles at once could
> be allowed as such:
>
> <video src="video.ogg">
>   <overlay src="en.srt" class="centerTop"></overlay>
>   <overlay src="hans.srt" class="centerBottom"></overlay>
> </video>

Ah yes, that is replicating the hierarchical approach I took with
itextlist / itext.[2] They could also be more generic text than just
subtitles and captions - in particular textual audio descriptions have been
confirmed at TPAC to be very useful indeed.

> centerTop/centerBottom are appropriately defined in CSS.

Those are almost like the default styling approaches I suggested for
itextlist / itext.[2] There, I also assumed there was a display area as
large as the video, or actually just a little larger, available to render
the time-aligned text into. It's larger since sometimes it is better not to
overlay stuff but to place it right next to the video, e.g. just above it
(title-like) or just below it but visually part of the video window.

> For what it's worth, it's easy to get this behavior (sans fullscreen) using
> scripting today, simply by cloning/moving the overlay elements outside of
> <video> and positioning them on top using CSS. Even SRT retrieval (XHR),
> decoding (RegExp) and syncing (timeupdate event) is easy enough to do.

It's indeed how I implemented the demos [3]. E.g.
http://www.annodex.net/~silvia/itext/elephant_no_skin_v2.html has divs
defined just outside the video element, but styled to sit directly over the
video.

Is this something that we would need to declare explicitly in the DOM, or
would that be something that the browser can introduce at that position and
expose to the DOM? Without the DOM exposure, there is no adaptive styling.

> Comments?

I think your ideas re CSS are great!
I am as yet unsure how that can be solved in the browser, so any ideas are
very much welcome.

Cheers,
Silvia.

[2] https://wiki.mozilla.org/Accessibility/HTML5_captions_v2
[3] http://www.annodex.net/~silvia/itext/

> [1]
> http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#concept-media-load-algorithm
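
The scripted approach mentioned in the thread (SRT decoding with RegExp and
syncing on the timeupdate event) could look roughly like the sketch below.
It is not the code behind the demos at [3]; the function names parseSrt,
srtTimeToSeconds and activeCues are hypothetical, and it assumes plain SRT
with "HH:MM:SS,mmm --> HH:MM:SS,mmm" timing lines and blank lines between
cues.

```javascript
// Illustrative sketch only: all function names here are hypothetical.

// Convert an SRT timestamp such as "00:01:02,500" to seconds.
function srtTimeToSeconds(stamp) {
  const m = /^(\d+):(\d{2}):(\d{2})[,.](\d{3})$/.exec(stamp.trim());
  if (!m) return NaN;
  return Number(m[1]) * 3600 + Number(m[2]) * 60 + Number(m[3]) + Number(m[4]) / 1000;
}

// Parse SRT text into an array of { start, end, text } cues (times in seconds).
function parseSrt(text) {
  const cues = [];
  // Normalise line endings, then split into cue blocks on blank lines.
  const blocks = text.replace(/\r\n?/g, "\n").trim().split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n");
    // The timing line is the one containing "-->"; a counter line may precede it.
    const idx = lines.findIndex((line) => line.includes("-->"));
    if (idx === -1) continue;
    const [startStamp, endPart] = lines[idx].split("-->");
    // Ignore anything after the end timestamp on the timing line.
    const endStamp = endPart.trim().split(/\s+/)[0];
    const start = srtTimeToSeconds(startStamp);
    const end = srtTimeToSeconds(endStamp);
    if (Number.isNaN(start) || Number.isNaN(end)) continue; // skip malformed cues
    cues.push({ start, end, text: lines.slice(idx + 1).join("\n") });
  }
  return cues;
}

// Return the cues that should be visible at playback time t (in seconds).
function activeCues(cues, t) {
  return cues.filter((cue) => t >= cue.start && t < cue.end);
}

// In a browser this could be wired up along these lines (assumed element
// variables, not executed here):
//   video.addEventListener("timeupdate", () => {
//     overlayDiv.textContent =
//       activeCues(cues, video.currentTime).map((c) => c.text).join("\n");
//   });
```

Keeping the parsing and lookup free of DOM access means the same functions
work whether the caption text is rendered into a div positioned over the
video, as in the demos, or into some future declarative overlay element.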
Received on Wednesday, 25 November 2009 13:30:31 UTC