Re: a11y TF CfC on resolution to support "Media Text Associations" change proposal for HTML issue 9 from Philip Jägenstedt on 2010-04-06 (public-html-a11y@w3.org from April 2010)

From: Philip Jägenstedt <philipj@opera.com>
Date: Tue, 06 Apr 2010 11:11:32 +0800
To: "Sean Hayes" <Sean.Hayes@microsoft.com>, "Silvia Pfeiffer" <silviapfeiffer1@gmail.com>
Cc: "public-html-a11y@w3.org" <public-html-a11y@w3.org>
Message-ID: <op.vap0hcbtatwj1d@philip-pc.oslo.osa>
Long thread, 2 comments:

I think external captions should always be fitted to the entire <video>  
element, regardless of whether or not letterboxing would apply. If the  
captions can only be seen correctly at the size of the video, then just  
don't set width or height on the <video> element to have the size  
calculated automatically (no letterboxing can occur). I don't think we  
need to introduce any new attributes to handle this.
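
To make the geometry concrete, here is a rough sketch (not taken from any
spec) of the two rectangles under discussion: the video frame as letterboxed
inside the element box, and a caption area fitted to the whole element. The
names Box, videoFrameBox and captionBox are hypothetical, and the sketch
assumes the UA preserves aspect ratio and centres the frame.

```typescript
// Hypothetical types and helpers for illustration only.
interface Box { x: number; y: number; width: number; height: number; }

// Where the video frame lands inside the element box, assuming the UA
// scales it to fit while preserving aspect ratio and centres it.
function videoFrameBox(boxW: number, boxH: number, videoW: number, videoH: number): Box {
  const scale = Math.min(boxW / videoW, boxH / videoH);
  const width = videoW * scale;
  const height = videoH * scale;
  return { x: (boxW - width) / 2, y: (boxH - height) / 2, width, height };
}

// Caption area fitted to the entire element, bars included.
function captionBox(boxW: number, boxH: number): Box {
  return { x: 0, y: 0, width: boxW, height: boxH };
}
```

When the element box has the same aspect ratio as the video (as it does when
neither width nor height is set), the two rectangles coincide, so no extra
attribute would be needed in that case.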

I think that browsers should render captions above the video but under the  
controls, so that controls can never be obscured. I'm not sure which spec  
(if any) needs to say this.

Philip

On Mon, 05 Apr 2010 09:26:56 +0800, Silvia Pfeiffer  
<silviapfeiffer1@gmail.com> wrote:

> Hi Sean,
>
> On Mon, Apr 5, 2010 at 2:49 AM, Sean Hayes <Sean.Hayes@microsoft.com>  
> wrote:
>> For 1) I would suggest we define the semantics of SRT, and any other  
>> format that has no formal timing model, in terms of TTML, (not that it  
>> has to be implemented that way of course). This will fix the interval  
>> issue and probably a whole bunch of other stuff too.
>
> SP:
> The text associations proposal has the ability to include other
> time-aligned text formats. Things such as SmilText, SSA/ASS or LRC or
> any other format could be used. You are assuming that every external
> time-aligned text format that is not TTML has no formal timing model.
> Also, you are assuming that every external time-aligned text format
> can be mapped without loss to TTML. I do not believe either of these
> are true. Thus, I don't think what you are proposing will work.
>
> Thus, what is proposed is a basic set of rules towards which we
> interpret the important parts of an external time-aligned text format.
> Right now, the only rule that we have is that start and end time of a
> timed text segment are interpreted as a semi-open interval:
> [start,end) . Since this is also how it works in TTML, there is no
> problem in interpreting TTML timed text segments.
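
A minimal sketch of the semi-open [start,end) rule described above, assuming
times in seconds. The TimedCue type and the function names are hypothetical
and only illustrate the interval test, not any particular format's parser.

```typescript
// Hypothetical cue shape: start and end in seconds on the media timeline.
interface TimedCue { start: number; end: number; text: string; }

// A cue is active from its start time up to, but not including, its end time.
function isActive(cue: TimedCue, t: number): boolean {
  return t >= cue.start && t < cue.end;
}

// All cues that should be displayed at playback time t.
function activeCues(cues: TimedCue[], t: number): TimedCue[] {
  return cues.filter((cue) => isActive(cue, t));
}
```

With back-to-back cues such as 0-2 and 2-4, only the second is active at
exactly t = 2, so adjacent segments never overlap for a single instant.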
>
> As for SRT - there will eventually be an RFC that specifies its basic
> rules, but everything that has described it thus far also states the
> same rule, so while it currently doesn't have a formally specified
> timing model, it is de facto specified.
>
> So, I agree partially with what you are saying: namely that the timing
> model for timed text segments as they are to be interpreted in HTML5
> should be fixed to being a semi-open interval of [start,end).
>
> I do, however, disagree with the proposal to map everything to TTML,
> since that may lead to lossy representation. Rather, I believe that
> every timed text segment needs to be mapped to HTML with a start and
> end time specification. Thus, anything but the start/end time can
> already be interpreted by HTML, while the start/end time provides the
> mapping to the timeline of the media resource.
>
> In this way, text formats are only exposed to any loss created by
> mapping from FORMAT X -> HTML, rather than a potential double loss by
> mapping from FORMAT X -> TTML -> HTML.
>
>
>
>> For 2) "Maybe a CSS attribute such as 'letterbox: include/exclude'?"  
>>  Making this an author choice is a good idea. But I wouldn't want to  
>> punt it to CSS. CSS controls the extent of the div, but this is  
>> slightly different.
>>  Maybe we can have an additional attribute on track:
>> @extent with values {media, container}
>>
>> Extent="media" means put the origin of the root rendering area for that  
>> track at the top left pixel in the video frame, and absent any  
>> information to the contrary in the TTML, makes the extent of it extend  
>> to the bottom right pixel in the video frame. The video frame may  
>> contain black bars, but these are not the same as bars applied by the UA.
>>
>> Extent="container" means make the root rendering area coincide exactly  
>> with the layout div (as you had it before), this would cause the TTML  
>> to render over any black bars or padding applied by the UA.
>
> SP:
> I believe it's ultimately a styling issue, but I would also be
> hesitant to have to wait for a CSS change.
>
> So, that sounds like a good proposal to me. It would only apply to
> video, I assume, since @extent=media on an audio element refers to
> nonexistent space.
>
> Also, I believe it would imply that @extent=container relates to the
> calculated width x height of the media resource. We need to pay
> attention here to the height of any default @controls that may be
> present. These controls are overlaid onto the bottom part of the
> video in all implementations and disappear if not used, so we need to
> make sure to state something that stops the captions from colliding
> with the controls.
>
> Maybe we can make the container extent for video only be the video
> height without the controls height but displayed above the controls.
> The captions will then move up when the controls appear and back down
> when they disappear. I've just uploaded a demo with Firefox that shows
> the problem, see http://www.youtube.com/watch?v=Ojeh7ffhAk4 .
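
The idea of shrinking the caption area while the controls are showing could
be sketched as below. This assumes the controls occupy a fixed-height strip
at the bottom of the element; captionAreaHeight and its parameters are
hypothetical names, not part of any proposal text.

```typescript
// Height available to captions inside the container, assuming the controls
// take a strip of controlsHeight pixels at the bottom while visible.
function captionAreaHeight(
  containerHeight: number,
  controlsHeight: number,
  controlsVisible: boolean
): number {
  const reserved = controlsVisible ? controlsHeight : 0;
  // Never report a negative height for very small containers.
  return Math.max(0, containerHeight - reserved);
}
```

Recomputing this whenever the controls appear or disappear gives the
behaviour described: captions move up when the controls show and back down
when they hide.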
>
>
>
>> 3) DAB has a number of possible text associations, including full web  
>> pages, but it seems they haven't thought of captions or subtitles yet.
>>
>> Provided we allow that <audio> can have a rendering area, then we can  
>> just give it the same default rendering as the <video> element, which  
>> is, I believe, 300x150px. If authors want, they can get rid of it by  
>> making it 0 in either dimension using CSS; they won't be able to apply  
>> captions in that case, but perhaps they will have a transcript.
>
> SP:
> HTML5 doesn't define a rendering area for <audio> by default, since
> <audio> is often used for background music on a Web page and thus is
> not rendered at all. It will thus have to be the other way around: if
> you want captions for your audio file, you have to give it a width and
> height in CSS, which then defines the container extent.
>
> We could, however, propose that if the audio resource has an enabled
> text track, the audio element receives a default rendering area
> similar to the <video> element. Though I would propose the area for
> audio to be smaller - maybe 100x150px?
>
>
> Regards,
> Silvia.
>


-- 
Philip Jägenstedt
Core Developer
Opera Software
Received on Tuesday, 6 April 2010 03:12:21 UTC