RE: a11y TF CfC on resolution to support "Media Text Associations" change proposal for HTML issue 9 from Sean Hayes on 2010-04-05 (public-html-a11y@w3.org from April 2010)

From: Sean Hayes <Sean.Hayes@microsoft.com>
Date: Mon, 5 Apr 2010 11:37:29 +0000
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
CC: "public-html-a11y@w3.org" <public-html-a11y@w3.org>
Message-ID: <8DEFC0D8B72E054E97DC307774FE4B911A4C97E4@DB3EX14MBXC301.europe.corp.microsoft.c>
I said "and any other format that has no formal timing model"; clearly smilText does come with its own model. It's possible that there are text formats that would not map to TTML, especially complex ones, but they probably would come under the above too. What I was really after was the myriad of simple formats which for the most part do map to TTML. In addition, we really only need to talk about this for formats which are mentioned in the spec. But if you want to replicate the necessary rules in the HTML spec, I'm OK with that; I just think it will be less efficient as you will need to explain your terms. Even TTML for the most part defers to SMIL.

Yes, "container" would have to exclude any control area, including dynamic ones; I've done that trick with Silverlight too.

I was assuming extent="media" would be an illegal/inappropriate value for <audio>, rather than the attribute not applying to <audio> at all, in case we come up with other values in the future. But as I'd like to make media the default, perhaps we could state something like: "if the media has no intrinsic size, then specifying the 'media' value for @extent results in the same area as 'container'".

"We could, however, propose that if the audio resource has an enabled text track, the audio element receives a default rendering area similar to the <video> element. Though I would propose the area for audio to be smaller - maybe 100x150px?"
- works for me.

-----Original Message-----
From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com] 
Sent: Monday, April 05, 2010 2:27 AM
To: Sean Hayes
Cc: public-html-a11y@w3.org
Subject: Re: a11y TF CfC on resolution to support "Media Text Associations" change proposal for HTML issue 9

Hi Sean,

On Mon, Apr 5, 2010 at 2:49 AM, Sean Hayes <Sean.Hayes@microsoft.com> wrote:
> For 1) I would suggest we define the semantics of SRT, and any other format that has no formal timing model, in terms of TTML, (not that it has to be implemented that way of course). This will fix the interval issue and probably a whole bunch of other stuff too.

SP:
The text associations proposal has the ability to include other time-aligned text formats. Things such as SmilText, SSA/ASS or LRC or any other format could be used. You are assuming that every external time-aligned text format that is not TTML has no formal timing model.
Also, you are assuming that every external time-aligned text format can be mapped without loss to TTML. I do not believe either of these is true. Thus, I don't think what you are proposing will work.

Thus, what is proposed is a basic set of rules according to which we interpret the important parts of an external time-aligned text format.
Right now, the only rule that we have is that the start and end time of a timed text segment are interpreted as a semi-open interval: [start,end). Since this is also how it works in TTML, there is no problem in interpreting TTML timed text segments.
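
[Illustration: a minimal Python sketch of the semi-open interval rule above. The Cue class and the function names are hypothetical and not part of the proposal; times are assumed to be in seconds.]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    """A timed text segment (hypothetical type, for illustration only)."""
    start: float  # seconds, inclusive
    end: float    # seconds, exclusive
    text: str


def is_active(cue: Cue, t: float) -> bool:
    """True when playback time t lies in the semi-open interval [start, end)."""
    return cue.start <= t < cue.end


def active_cues(cues, t):
    """Return the cues that should be shown at playback time t."""
    return [c for c in cues if is_active(c, t)]


# Two back-to-back segments sharing the boundary t = 2.0.
cues = [Cue(0.0, 2.0, "first"), Cue(2.0, 4.0, "second")]
shown_at_boundary = [c.text for c in active_cues(cues, 2.0)]
```

With a semi-open interval, back-to-back segments never overlap at their shared boundary: at t = 2.0 only the second segment is active.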

As for SRT - there will eventually be an RFC that specifies its basic rules, but everything that has described it thus far also states the same rule, so while it currently doesn't have a formally specified timing model, it is de facto specified.

So, I agree partially with what you are saying: namely that the timing model for timed text segments as they are to be interpreted in HTML5 should be fixed to being a semi-open interval of [start,end).

I do, however, disagree with the proposal to map everything to TTML, since that may lead to lossy representation. Rather, I believe that every timed text segment needs to be mapped to HTML with a start and end time specification. Thus, anything but the start/end time can already be interpreted by HTML, while the start/end time provides the mapping to the timeline of the media resource.

In this way, text formats are only exposed to any loss created by mapping from FORMAT X -> HTML, rather than a potential double loss by mapping from FORMAT X -> TTML -> HTML.
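
[Illustration: a rough Python sketch of the direct FORMAT X -> HTML mapping, using SRT as the example format. It assumes the commonly described SRT block layout (counter line, a timing line of the form "HH:MM:SS,mmm --> HH:MM:SS,mmm", then text lines); the function name and the returned dictionary shape are hypothetical, not something the proposal defines.]

```python
import re
from html import escape

# Timing line as commonly described for SRT (assumed, not formally specified).
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2}),(\d{3})"
)


def _seconds(h, m, s, ms):
    """Convert hour/minute/second/millisecond strings to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def srt_block_to_segment(block: str) -> dict:
    """Map one SRT block to a start time, an end time, and an HTML fragment."""
    lines = block.strip().splitlines()
    if len(lines) < 2:
        raise ValueError("expected a counter line and a timing line")
    match = _TIMING.fullmatch(lines[1].strip())
    if match is None:
        raise ValueError("unrecognised timing line: " + lines[1])
    parts = match.groups()
    return {
        "start": _seconds(*parts[:4]),  # inclusive
        "end": _seconds(*parts[4:]),    # exclusive, per the [start,end) rule
        # Text lines are escaped and joined with <br>; markup inside SRT text
        # is deliberately not interpreted in this sketch.
        "html": "<br>".join(escape(line) for line in lines[2:]),
    }


segment = srt_block_to_segment(
    "1\n00:00:01,500 --> 00:00:04,000\nHello & welcome\nsecond line"
)
```

Only the timing is extracted as data here; everything else is handed to HTML in a single step, with no intermediate format.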



> For 2) "Maybe a CSS attribute such as 'letterbox: include/exclude'?": making this an author choice is a good idea. But I wouldn't want to punt it to CSS. CSS controls the extent of the div, but this is slightly different.
>  Maybe we can have an additional attribute on track:
> @extent with values {media, container}
>
> Extent="media" means put the origin of the root rendering area for that track at the top left pixel in the video frame and, absent any information to the contrary in the TTML, make its extent reach to the bottom right pixel in the video frame. The video frame may contain black bars, but these are not the same as bars applied by the UA.
>
> Extent="container" means make the root rendering area coincide exactly with the layout div (as you had it before), this would cause the TTML to render over any black bars or padding applied by the UA.
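
[Illustration: a small Python sketch of how the two @extent values could resolve to a rectangle. It assumes the UA scales the video frame to fit the layout box while preserving aspect ratio and centres it, which is only one possible letterboxing behaviour; the function name and the (x, y, width, height) tuple layout are hypothetical.]

```python
def root_rendering_area(extent, container, video):
    """Return (x, y, width, height) of the root rendering area.

    container: (width, height) of the element's layout box, in pixels.
    video:     (width, height) of the decoded video frame, in pixels.
    Coordinates are relative to the top left corner of the layout box.
    """
    cw, ch = container
    if extent == "container":
        # Coincide exactly with the layout box, covering any UA-applied bars.
        return (0.0, 0.0, float(cw), float(ch))
    if extent == "media":
        vw, vh = video
        # Assumed UA behaviour: scale to fit, keep aspect ratio, centre.
        scale = min(cw / vw, ch / vh)
        w, h = vw * scale, vh * scale
        return ((cw - w) / 2, (ch - h) / 2, w, h)
    raise ValueError("unknown extent value: " + repr(extent))


# A 16:9 frame shown in a 4:3 box gets UA bars above and below.
media_area = root_rendering_area("media", (640, 480), (1280, 720))
container_area = root_rendering_area("container", (640, 480), (1280, 720))
```

In this example the "media" area starts 60 pixels down and is 360 pixels tall, while the "container" area covers the whole 640x480 box including the bars.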

SP:
I believe it's ultimately a styling issue, but I would also be hesitant to have to wait for a CSS change.

So, that sounds like a good proposal to me. It would only apply to video, I assume, since @extent=media on an audio element refers to non-existent space.

Also, I believe it would imply that @extent=container relates to the calculated width x height of the media resource. We need to pay attention here to the height of any default @controls that may be present. These controls are overlaid onto the bottom part of the video in all implementations and disappear if not used, so we need to make sure to state something that stops the captions from colliding with the controls.

Maybe we can make the container extent for video only be the video height without the controls height but displayed above the controls.
The captions will then move up when the controls appear and back down when they disappear. I've just uploaded a demo with Firefox that shows the problem, see http://www.youtube.com/watch?v=Ojeh7ffhAk4 .
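
[Illustration: a minimal Python sketch of the idea above, where the caption area shrinks while the controls are shown and grows back when they disappear. The function name is hypothetical, and the 30px controls height in the example is an arbitrary illustrative value, not taken from any implementation.]

```python
def caption_area_height(container_height, controls_height, controls_visible):
    """Height available for captions, measured from the top of the video.

    While the controls are visible, their height is excluded so that
    captions are laid out above them instead of underneath them.
    """
    if controls_visible:
        return max(container_height - controls_height, 0)
    return container_height


# Captions anchored to the bottom of this area move up by the controls
# height when the controls appear, and back down when they disappear.
with_controls = caption_area_height(360, 30, True)
without_controls = caption_area_height(360, 30, False)
```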



> 3) DAB has a number of possible text associations, including full web pages, but it seems they haven't thought of captions or subtitles yet.
>
> Provided we allow that <audio> can have a rendering area, then we can just give it the same default rendering as the <video> element, which is I believe 300x150px. If authors want, they can get rid of it by making it 0 in either dimension using CSS; they won't be able to apply captions in that case, but perhaps they will have a transcript.

SP:
HTML5 doesn't define a rendering area for <audio> by default, since <audio> is often used for background music on a Web page and thus is not rendered at all. It will thus have to be the other way around: if you want captions for your audio file, you have to give it a width and height in CSS, which then defines the container extent.

We could, however, propose that if the audio resource has an enabled text track, the audio element receives a default rendering area similar to the <video> element. Though I would propose the area for audio to be smaller - maybe 100x150px?
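
[Illustration: the rule suggested above, sketched in Python. The sizes are the ones mentioned in this thread (300x150px for video, and the tentative 100x150px for audio); the function name and the use of None for "not rendered" are hypothetical.]

```python
def default_rendering_area(element, has_enabled_text_track):
    """Return the default (width, height) in pixels, or None if not rendered."""
    if element == "video":
        return (300, 150)
    if element == "audio":
        # Audio is only given an area when there is a text track to show;
        # the size is the tentative value floated in this thread.
        return (100, 150) if has_enabled_text_track else None
    raise ValueError("unknown media element: " + repr(element))


background_music = default_rendering_area("audio", False)
captioned_audio = default_rendering_area("audio", True)
```

An author could still override either default with an explicit width and height in CSS.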


Regards,
Silvia.
Received on Monday, 5 April 2010 11:38:08 UTC