- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Wed, 8 Jun 2011 16:46:10 +1000
- To: Sean Hayes <Sean.Hayes@microsoft.com>
- Cc: "public-html-a11y@w3.org" <public-html-a11y@w3.org>
On Wed, Jun 8, 2011 at 9:30 AM, Sean Hayes <Sean.Hayes@microsoft.com> wrote:
> "Therefore, you cannot rely on the video making progress at the same time as the TTS engine."
> Possibly not, however in the absence of trick play (which I think would have to cancel any descriptions), one can probably assume the video won't go *faster* than expected. Therefore if you set an internal handler for the assumed end time, then even if the video hasn't reached that point yet because it stalled, no real harm is done issuing a pause.

The TTS engine might go slower than expected (because it, too, may be starved of CPU), so the same effect as the video going slower than expected would still happen.

> "I do not know how to inform the browser or a JS when the screen reader has finished reading text in a cross-browser compatible way."
> Do we need to is my point?

I think we do, since the TTS engine and the video player are two processes that run asynchronously, and therefore synchronisation is necessary.

> "Descriptions delivered as audio do not come in the TextTrack. They come in the multitrack API."
> That's arguing we shouldn't change the design because the design is wrong. To the end user they are both descriptions and serve the same purpose; the user doesn't care what markup tag caused them to come into existence.

You're assuming that the current design is wrong. Let's analyse that before making such an assumption.

When we deal with text descriptions, they have to be voiced somehow, which requires a TTS somewhere in the pipeline. When we deal with audio descriptions, they come directly from the video element and are thus a native part of the browser, not handed through to a TTS. I find it hard to see how these two fundamentally different types of content can be exposed to the user in the same way. In particular: audio descriptions stay in sync with the video and there is no need to pause the video to present them, while text descriptions create the need for extending the timeline and for the pausing behaviour. I think they are inherently different, and trying to fool the user into thinking that they are identical will just lead to problems.

> "So you want them displayed as well as the captions? Always or only when they are also read out? What screen real estate are you expecting to use? Can you provide an example as a use case?"
> They would be presented as both captions and descriptions, so they are displayed when the user selects them in the caption menu and for their allotted duration. I'm expecting the author to determine the screen real estate exactly as they do for other captions. I demoed an example at the f2f if you recall. I'll check tomorrow whether it's still online.

Does selecting them in the captions menu automatically mean they have to be shown on the screen? We have to be careful about the consequences: we would be introducing two new states, making it four states that an audio description track can be in: off; on and voiced; on and visible; and on and visible and voiced. A single entry in a menu will no longer suffice to select an audio description track. This single change creates heaps of new complexity.

If an author really wants to display the text descriptions as text, right now they would use some JavaScript to do so. Is that not sufficient? Should we not wait and see how large the need for such a feature is, rather than jumping to conclusions on a feature that doesn't exist anywhere else yet?
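Roughly, I mean something like this (an untested sketch; the ids "desc" and "descDisplay" are just placeholders, and the exact TextTrack API names may still differ between spec drafts and implementations):

  var track = document.getElementById('desc').track;      // the descriptions TextTrack
  var display = document.getElementById('descDisplay');   // author-chosen container element
  track.mode = 'hidden';                                   // load cues without native rendering
  track.addEventListener('cuechange', function () {
    var texts = [];
    for (var i = 0; i < track.activeCues.length; i++) {
      texts.push(track.activeCues[i].text);
    }
    display.textContent = texts.join('\n');                // mirror active cues into the page
  });

The author keeps full control over where and how the text appears, which is what they do for rolled-up captions today.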
> "Screen readers provide the interface to the Braille devices."
> Screen readers are certainly the primary providers of text to a Braille device, but it's basically an output port; other processes, like the media subsystem, could potentially use it too. I don't think it's a given that we'd assume descriptions (which as you say aren't generally on the screen, and aren't in the DOM), should actually be read by a screen reader.

They are in the shadow DOM and there is a JavaScript API for them. They exist more in the page than other external content such as picture, audio or video data.

> I am still not 100% on board with the idea that text track descriptions should be relying on the presence of a screen reader, since a SR is going to be doing a lot of other things related to navigation on the page. I'm not sure SR designers have even considered this use case.

Probably not yet. I am starting discussions on the IA2 mailing list to see what people are thinking about it, since that is where the most impact would be felt.

The issue is that the SR and video playback have to interact constructively; you can't just have them as completely separate modules. The screen reader has control over an audio description track of the video element - why should it not have control over a text description track, too?

Also, right now screen readers are the only TTS engine we get for Web pages, so if we don't make use of them for text descriptions, we can't do anything with text descriptions. What alternative do we have?
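To make the synchronisation gap concrete, here is a rough sketch of the interaction I have in mind. The speakDescription(), speechFinished() and onDescriptionEnd() hooks are purely hypothetical - we have no cross-browser way to get them today, which is exactly the missing piece:

  var video = document.querySelector('video');
  var track = document.getElementById('desc').track;       // <track kind="descriptions">
  track.mode = 'hidden';
  track.addEventListener('cuechange', function () {
    if (track.activeCues.length === 0) return;
    var cue = track.activeCues[0];
    speakDescription(cue.text);                // hand the text to the SR/TTS (hypothetical)
    // Handler at the assumed end time, as Sean suggests: even if the video has
    // stalled and not actually reached that point, an extra pause does no harm.
    var wait = Math.max(0, (cue.endTime - video.currentTime) * 1000);
    setTimeout(function () {
      if (!speechFinished()) video.pause();    // hypothetical query of the TTS state
    }, wait);
  });
  // Hypothetical notification from the SR/TTS that it has finished speaking;
  // without it we cannot know when it is safe to resume playback.
  onDescriptionEnd(function () {
    if (video.paused) video.play();
  });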
Cheers,
Silvia.

Received on Wednesday, 8 June 2011 06:46:58 UTC