Re: Requirements for external text alternatives for audio/video from Silvia Pfeiffer on 2010-03-25 (public-html-a11y@w3.org from March 2010)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Fri, 26 Mar 2010 10:59:04 +1100
To: Sean Hayes <Sean.Hayes@microsoft.com>
Cc: Eric Carlson <eric.carlson@apple.com>, Geoff Freed <geoff_freed@wgbh.org>, HTML Accessibility Task Force <public-html-a11y@w3.org>, Matt May <mattmay@adobe.com>, Philippe Le Hegaret <plh@w3.org>
Message-ID: <2c0e02831003251659n25ca80d5t86910cbc2f1f8803@mail.gmail.com>
Hi Sean, all,

Sorry, Sean, if I have misinterpreted you.

Your statement actually leads neatly onto a discussion that I have
been meaning to have next. Namely: how should the external text
support be implemented in HTML?

There has been a bit of private discussion on this topic before, and
also some at http://lists.w3.org/Archives/Public/public-html-a11y/2009Nov/0120.html,
but I'd like to build on these discussions and put forward the ideas
that were mentioned. I may not have captured every possible
implementation idea or every possible aspect, so don't feel
limited by this list.

1. Expose it directly in the DOM on-the-fly
This would mean that inside the <track> element (or some other
element), we would render the piece(s) of text that is(are)
"currently" active together with their styling. This would be
<div>-like.

2. Render it in the shadow-DOM on-the-fly
Instead of rendering it directly into the DOM, it could be rendered
into the shadow DOM (as anonymous DOM elements), so that the content
cannot be manipulated through JavaScript.

3. Expose it in an iframe-like construct on-the-fly
Instead of opening the content into the main document's DOM, an
<iframe>-like construct could be used. This addresses cross-site
security issues.

4. Expose the complete content in an iframe-like construct
The complete content of the external text track could be parsed into
HTML and exposed in an iframe-like construct. Some timing elements
would need to be introduced for this.

5. Instead of mapping to HTML, introduce a new layout format
Similar to other elements like SVG, we could introduce a new layout
format that happens to share some of the layout code with HTML and
would also work <iframe>-like. This format would also contain the full
content of the external text file, not just the active part, but
display activation will be triggered by the video's timeline.

6. Instead of exposing it in the DOM, provide an attribute on <track>
Developers will need to be able to override the styling and
placement provided in the external text file (if only to give the
video and everything around it a corporate look). If we do not allow
this through CSS and the DOM, we could provide a property on <track>
which provides the caption data in a standard XML form, and post an
event when it is time to display a caption. This will give a developer
everything they need to display captions in sync with the movie, *and*
it allows us to deal with the security violation when a script loaded
from one origin tries to access internal captions in a movie loaded
from another origin (throw an exception?). Note that this is only an
issue with internal captions; external captions can already be loaded
with XHR, so we don't need to impose this restriction on them.
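
To make the timing side of these options a bit more concrete, here is
a rough TypeScript sketch of the two things they all rely on: working
out which cues are "currently" active (options 1 to 3), and deriving
the show/hide events a track might post as playback advances
(option 6). The names Cue, CueEvent, activeCues, cueEvents and
sampleCues are made up for illustration and do not come from any
specification; cue times are assumed to be in seconds with an
exclusive end time.

```typescript
// Hypothetical cue shape, for illustration only (not from any spec).
interface Cue {
  start: number; // seconds, inclusive
  end: number;   // seconds, exclusive
  text: string;
}

// Hypothetical event payload for the "time to display a caption" idea.
interface CueEvent {
  type: "show" | "hide";
  cue: Cue;
}

// Options 1-3: the cues that are active at a given playback time.
function activeCues(cues: Cue[], time: number): Cue[] {
  return cues.filter((c) => c.start <= time && time < c.end);
}

// Option 6: the events to post when playback moves from prevTime to time.
// Simplification: a cue that starts and ends entirely between the two
// times produces no events in this sketch.
function cueEvents(cues: Cue[], prevTime: number, time: number): CueEvent[] {
  const before = activeCues(cues, prevTime);
  const after = activeCues(cues, time);
  const hidden = before
    .filter((c) => after.indexOf(c) === -1)
    .map((cue): CueEvent => ({ type: "hide", cue }));
  const shown = after
    .filter((c) => before.indexOf(c) === -1)
    .map((cue): CueEvent => ({ type: "show", cue }));
  return hidden.concat(shown);
}

// Two overlapping sample cues.
const sampleCues: Cue[] = [
  { start: 1, end: 4, text: "Hello" },
  { start: 3, end: 6, text: "World" },
];

// Moving from 2s to 3.5s: "Hello" stays active, "World" becomes active.
console.log(cueEvents(sampleCues, 2, 3.5).map((e) => e.type + ": " + e.cue.text));
```

A page script could presumably call something like cueEvents from a
"timeupdate" handler on the video and render the result however it
likes, which is where the corporate-look styling would come in.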

Ultimately, I think this list won't have sufficient expertise to
discuss this topic, so I would like to take the topic of
implementation to the larger HTML5 WG. But since there are several
people with experience here, maybe we can get some initial ideas and
opinions, so the larger discussion can be more focused.

Fire away with your knowledge / opinions!

Cheers,
Silvia.


On Fri, Mar 26, 2010 at 9:23 AM, Sean Hayes <Sean.Hayes@microsoft.com> wrote:
> OK, to be clear I wasn't saying that mapping to CSS is the way things should be done, but only that it is one implementation option. I personally think that the text overlay should be considered as outside of the HTML space in exactly the same way as the video and audio streams are. Captions are a media essence, and have IP rights associated with them. If we integrate the display model into the HTML one, this potentially exposes the caption text to the viewer, and this approach won't work with a protected media file.
>
> Implementing TTML using a private HTML/CSS stack would be fine, but is as I say, just one implementation option.
>
>
> -----Original Message-----
> From: public-html-a11y-request@w3.org [mailto:public-html-a11y-request@w3.org] On Behalf Of Silvia Pfeiffer
> Sent: Thursday, March 25, 2010 8:50 PM
> To: Eric Carlson
> Cc: Geoff Freed; HTML Accessibility Task Force; Matt May; Philippe Le Hegaret
> Subject: Re: Requirements for external text alternatives for audio/video
>
> Hi Eric,
>
> On Fri, Mar 26, 2010 at 2:46 AM, Eric Carlson <eric.carlson@apple.com> wrote:
>>
>> On Mar 24, 2010, at 9:29 PM, Silvia Pfeiffer wrote:
>>
>> In summary - I would suggest keeping the File Format requirement at
>> http://www.w3.org/WAI/PF/HTML/wiki/Media_TextAssociations#File_Formats
>> with supporting both, srt and dfxp (or ttml as Sean clarified).
>>
>>   What DFXP profile are you suggesting we mandate?
>>   As Maciej noted [1], even the presentation profile requires XSL-FO.
>> Does anyone actually think it is reasonable to require a UA to
>> implement this substantial spec just to style captions?
>> eric
>> [1] -
>> http://lists.w3.org/Archives/Public/public-html-a11y/2010Mar/0103.html
>
> I believe right now all we need to mandate is the required part of the minimum profile
> (http://www.w3.org/TR/ttaf1-dfxp/#profile-dfxp-presentation) - it would conform with WCAG and be extensible to the other features that will certainly be mandated in the future. It looks to me that if other profiles are necessary beyond the ones already given in the TTML specification, these can be developed at a later stage.
>
> Right now we need to take care to find a way to deal with the style and layout specifications. I agree with Sean that this should be done not by implementing the TTML specifications directly, but by mapping them to existing HTML/CSS/JavaScript constructs.
>
>
> Philippe's demo is at http://www.w3.org/2009/02/ThisIsCoffee.html
> with the original TTML file at
> http://www.w3.org/2009/02/ThisIsCoffee61_captions.xml and the JavaScript that interprets it at http://www.w3.org/2008/12/dfxp-testsuite/web-framework/HTML5_player.js.
> The test suite is at
> http://www.w3.org/2008/12/dfxp-testsuite/web-framework/START.html
> which demonstrates support (or lack of support) for each TTML feature
> - choose the HTML5 player to see what mappings are already supported.
>
> These mappings that are currently done in JavaScript have to be extracted into a specification document. And we need to make sure when we implement support for captions that we can add the features parsed out of TTML into the HTML document.
>
> Cheers,
> Silvia.
>
>
>
Received on Thursday, 25 March 2010 23:59:57 UTC