Re: a11y TF CfC on resolution to support "Media Text Associations" change proposal for HTML issue 9 from Philip Jägenstedt on 2010-04-06 (public-html-a11y@w3.org from April 2010)

From: Philip Jägenstedt <philipj@opera.com>
Date: Tue, 06 Apr 2010 14:36:05 +0800
To: "Eric Carlson" <eric.carlson@apple.com>, "Silvia Pfeiffer" <silviapfeiffer1@gmail.com>
Cc: "Sean Hayes" <Sean.Hayes@microsoft.com>, "public-html-a11y@w3.org" <public-html-a11y@w3.org>
Message-ID: <op.vap9yfj8atwj1d@philip-pc.oslo.osa>
Eric's interpretation is what I intended. I don't think it's necessary to  
do anything special (like moving) to keep the captions readable when the  
controls are visible, as the controls will go away as soon as you're  
actually watching the video. In the worst case, you can disable the  
controls from the context menu.

Philip

On Tue, 06 Apr 2010 13:31:01 +0800, Silvia Pfeiffer  
<silviapfeiffer1@gmail.com> wrote:

> Does that mean that the captions move above the controls when the
> controls become visible? Or do you mean that the controls are always
> visible when there are captions? I think that both the controls and
> the captions need to remain visible while the controls are shown, but the
> captions can move down into the controls area when the controls are
> invisible. Just like in http://www.youtube.com/watch?v=Ojeh7ffhAk4 .
>
> Cheers,
> Silvia.
>
> On Tue, Apr 6, 2010 at 3:22 PM, Eric Carlson <eric.carlson@apple.com>  
> wrote:
>> I took it to mean that the render order is always 1) the video frame,  
>> 2) the captions, and 3) the controls. This will ensure that the  
>> controls are always visible, even if the captions extend into the  
>> controls region. I agree that this is the correct behavior.
>>
>> eric
>>
>> On Apr 6, 2010, at 6:08 AM, Silvia Pfeiffer <silviapfeiffer1@gmail.com>  
>> wrote:
>>
>>> What do you mean by "rendering above the video but below the  
>>> controls"? Do you mean "overlaying" them on the video in the same spot  
>>> as the controls?
>>>
>>> I would think that the behaviour in the Firefox demo that I screencast  
>>> and posted to YouTube (link posted earlier) is the most intuitive.
>>>
>>> Regards,
>>> Silvia.
>>>
>>> Sent from my iPhone
>>>
>>> On 06/04/2010, at 1:11 PM, Philip Jägenstedt <philipj@opera.com> wrote:
>>>
>>>> Long thread, 2 comments:
>>>>
>>>> I think external captions should always be fitted to the entire  
>>>> <video> element, regardless of whether or not letterboxing would  
>>>> apply. If the captions can only be seen correctly at the size of the  
>>>> video, then just don't set width or height on the <video> element to  
>>>> have the size calculated automatically (no letterboxing can occur). I  
>>>> don't think we need to introduce any new attributes to handle this.
>>>>
>>>> I think that browsers should render captions above the video but  
>>>> under the controls, so that controls can never be obscured. I'm not  
>>>> sure which spec (if any) needs to say this.
>>>>
>>>> Philip
>>>>
>>>> On Mon, 05 Apr 2010 09:26:56 +0800, Silvia Pfeiffer  
>>>> <silviapfeiffer1@gmail.com> wrote:
>>>>
>>>>> Hi Sean,
>>>>>
>>>>> On Mon, Apr 5, 2010 at 2:49 AM, Sean Hayes  
>>>>> <Sean.Hayes@microsoft.com> wrote:
>>>>>> For 1) I would suggest we define the semantics of SRT, and any  
>>>>>> other format that has no formal timing model, in terms of TTML,  
>>>>>> (not that it has to be implemented that way of course). This will  
>>>>>> fix the interval issue and probably a whole bunch of other stuff  
>>>>>> too.
>>>>>
>>>>> SP:
>>>>> The text associations proposal has the ability to include other
>>>>> time-aligned text formats. Things such as SmilText, SSA/ASS or LRC or
>>>>> any other format could be used. You are assuming that every external
>>>>> time-aligned text format that is not TTML has no formal timing model.
>>>>> Also, you are assuming that every external time-aligned text format
>>>>> can be mapped without loss to TTML. I do not believe either of these
>>>>> is true. Thus, I don't think what you are proposing will work.
>>>>>
>>>>> Thus, what is proposed is a basic set of rules according to which we
>>>>> interpret the important parts of an external time-aligned text format.
>>>>> Right now, the only rule that we have is that start and end time of a
>>>>> timed text segment are interpreted as a semi-open interval:
>>>>> [start,end) . Since this is also how it works in TTML, there is no
>>>>> problem in interpreting TTML timed text segments.
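
The semi-open interval rule quoted above can be illustrated with a minimal sketch. The Cue class and the active_cues function are hypothetical names made up for this example; they do not come from the change proposal or from any spec. The point is only that a segment is shown from its start time up to, but not including, its end time, so two back-to-back segments never overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    """A timed text segment (hypothetical structure for illustration)."""
    start: float  # seconds, inclusive
    end: float    # seconds, exclusive
    text: str


def active_cues(cues, t):
    """Return the cues whose semi-open interval [start, end) contains t."""
    return [c for c in cues if c.start <= t < c.end]


# Two back-to-back segments that share the boundary at 2.0 seconds.
cues = [Cue(0.0, 2.0, "first"), Cue(2.0, 4.0, "second")]

# At the shared boundary only the second segment is active.
boundary_texts = [c.text for c in active_cues(cues, 2.0)]
```

With a closed interval [start,end] both segments would be active at exactly 2.0 seconds; the semi-open reading avoids that.
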
>>>>>
>>>>> As for SRT - there will eventually be an RFC that specifies its basic
>>>>> rules, but everything that has described it so far also states the
>>>>> same rule, so while it currently doesn't have a formally specified
>>>>> timing model, it is de facto specified.
>>>>>
>>>>> So, I agree partially with what you are saying: namely that the timing
>>>>> model for timed text segments as they are to be interpreted in HTML5
>>>>> should be fixed to being a semi-open interval of [start,end).
>>>>>
>>>>> I do, however, disagree with the proposal to map everything to TTML,
>>>>> since that may lead to lossy representation. Rather, I believe that
>>>>> every timed text segment needs to be mapped to HTML with a start and
>>>>> end time specification. Thus, anything but the start/end time can
>>>>> already be interpreted by HTML, while the start/end time provides the
>>>>> mapping to the timeline of the media resource.
>>>>>
>>>>> In this way, text formats are only exposed to any loss created by
>>>>> mapping from FORMAT X -> HTML, rather than a potential double loss by
>>>>> mapping from FORMAT X -> TTML -> HTML.
>>>>>
>>>>>
>>>>>
>>>>>> For 2) "Maybe a CSS attribute such as "letterbox:  
>>>>>> include/exclude"?" Making this an author choice is a good idea.  
>>>>>> But I wouldn't want to punt it to CSS. CSS controls the extent of  
>>>>>> the div, but this is slightly different.
>>>>>> Maybe we can have an additional attribute on track:
>>>>>> @extent with values {media, container}
>>>>>>
>>>>>> Extent="media" means put the origin of the root rendering area for  
>>>>>> that track at the top left pixel in the video frame and, absent any  
>>>>>> information to the contrary in the TTML, extend it to the bottom  
>>>>>> right pixel in the video frame. The video frame may contain black  
>>>>>> bars, but these are not the same as bars applied by the UA.
>>>>>>
>>>>>> Extent="container" means make the root rendering area coincide  
>>>>>> exactly with the layout div (as you had it before); this would  
>>>>>> cause the TTML to render over any black bars or padding applied by  
>>>>>> the UA.
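
A rough sketch of the two proposed @extent values, to make the difference concrete. It assumes that the UA scales the video frame to fit the layout box while preserving aspect ratio and centres it, which is one plausible letterboxing behaviour and not something the proposal states. The function name root_area and its parameters are hypothetical.

```python
def root_area(extent, box_w, box_h, video_w, video_h):
    """Return (x, y, width, height) of the root rendering area.

    extent  -- "media" or "container", as in the proposed @extent attribute
    box_*   -- size of the layout box of the media element, in CSS pixels
    video_* -- intrinsic size of the video frame, in pixels
    Assumes the UA letterboxes by scaling uniformly and centring the frame.
    """
    if extent == "container":
        # Cover the whole layout box, including any bars added by the UA.
        return (0.0, 0.0, float(box_w), float(box_h))
    if extent == "media":
        # Cover only the scaled video frame, excluding bars added by the UA.
        scale = min(box_w / video_w, box_h / video_h)
        w, h = video_w * scale, video_h * scale
        return ((box_w - w) / 2, (box_h - h) / 2, w, h)
    raise ValueError("extent must be 'media' or 'container'")


# A 16:9 video in a 4:3 box gets bars above and below the frame.
media_area = root_area("media", 400, 300, 640, 360)
container_area = root_area("container", 400, 300, 640, 360)
```

In this example the "media" area starts 37.5 pixels below the top of the box and is 225 pixels high, while the "container" area covers the full 400x300 box.
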
>>>>>
>>>>> SP:
>>>>> I believe it's ultimately a styling issue, but I would also be
>>>>> hesitant to have to wait for a CSS change.
>>>>>
>>>>> So, that sounds like a good proposal to me. It would only apply to
>>>>> video, I assume, since @extent=media on an audio element refers to
>>>>> non-existent space.
>>>>>
>>>>> Also, I believe it would imply that @extent=container relates to the
>>>>> calculated width x height of the media resource. We need to pay
>>>>> attention here to the height of any default @controls that may be
>>>>> present. These controls are overlaid onto the bottom part of the
>>>>> video in all implementations and disappear if not used, so we need to
>>>>> make sure to state something that stops the captions from colliding
>>>>> with the controls.
>>>>>
>>>>> Maybe we can make the container extent for video only be the video
>>>>> height without the controls height but displayed above the controls.
>>>>> The captions will then move up when the controls appear and back down
>>>>> when they disappear. I've just uploaded a demo with Firefox that shows
>>>>> the problem, see http://www.youtube.com/watch?v=Ojeh7ffhAk4 .
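
The idea quoted above, that the caption container shrinks by the controls height while the controls are shown, could be sketched as follows. The function caption_area and its parameters are hypothetical and only illustrate the suggestion; no browser or spec is claimed to work this way.

```python
def caption_area(video_height, controls_height, controls_visible):
    """Return (top, bottom) of the caption container in CSS pixels.

    While the controls are shown, the bottom edge sits above them;
    once they disappear, the captions can use the full video height.
    """
    if controls_visible:
        bottom = video_height - controls_height
    else:
        bottom = video_height
    return (0, bottom)


# A 360px-high video with 30px-high controls.
with_controls = caption_area(360, 30, True)
without_controls = caption_area(360, 30, False)
```

Captions anchored to the bottom of this area move up by 30 pixels when the controls appear and back down when they disappear.
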
>>>>>
>>>>>
>>>>>
>>>>>> 3) DAB has a number of possible text associations, including full  
>>>>>> web pages, but it seems they haven't thought of captions or  
>>>>>> subtitles yet.
>>>>>>
>>>>>> Provided we allow that <audio> can have a rendering area, then we  
>>>>>> can just give it the same default rendering as the <video> element,  
>>>>>> which is I believe 300x150px. If authors want, they can get rid of  
>>>>>> it by making it 0 in either dimension using CSS; they won't be able  
>>>>>> to apply captions in that case, but perhaps they will have a  
>>>>>> transcript.
>>>>>
>>>>> SP:
>>>>> HTML5 doesn't define a rendering area for <audio> by default, since
>>>>> <audio> is often used for background music on a Web page and thus is
>>>>> not rendered at all. It will thus have to be the other way around: if
>>>>> you want captions for your audio file, you have to give it a width and
>>>>> height in CSS, which then defines the container extent.
>>>>>
>>>>> We could, however, propose that if the audio resource has an enabled
>>>>> text track, the audio element receives a default rendering area
>>>>> similar to the <video> element. Though I would propose the area for
>>>>> audio to be smaller - maybe 100x150px?
>>>>>
>>>>>
>>>>> Regards,
>>>>> Silvia.
>>>>>
>>>>
>>>>
>>>> --
>>>> Philip Jägenstedt
>>>> Core Developer
>>>> Opera Software
>>>
>>
>


-- 
Philip Jägenstedt
Core Developer
Opera Software
Received on Tuesday, 6 April 2010 06:36:54 UTC