Re: a11y TF CfC on resolution to support "Media Text Associations" change proposal for HTML issue 9 from Silvia Pfeiffer on 2010-04-05 (public-html-a11y@w3.org from April 2010)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Mon, 5 Apr 2010 11:26:56 +1000
To: Sean Hayes <Sean.Hayes@microsoft.com>
Cc: "public-html-a11y@w3.org" <public-html-a11y@w3.org>
Message-ID: <n2j2c0e02831004041826vaa712272q4047a677d98da16f@mail.gmail.com>

Hi Sean,

On Mon, Apr 5, 2010 at 2:49 AM, Sean Hayes <Sean.Hayes@microsoft.com> wrote:
> For 1) I would suggest we define the semantics of SRT, and any other format that has no formal timing model, in terms of TTML, (not that it has to be implemented that way of course). This will fix the interval issue and probably a whole bunch of other stuff too.

SP:
The text associations proposal can include other time-aligned text
formats: SmilText, SSA/ASS, LRC, or any other format could be used.
You are assuming that every external time-aligned text format that is
not TTML has no formal timing model. You are also assuming that every
such format can be mapped to TTML without loss. I do not believe
either of these is true, so I don't think what you are proposing will
work.

What is proposed instead is a basic set of rules by which we
interpret the important parts of an external time-aligned text format.
Right now, the only rule that we have is that the start and end time
of a timed text segment are interpreted as a semi-open interval:
[start,end). Since this is also how it works in TTML, there is no
problem in interpreting TTML timed text segments.
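
As a minimal sketch of what the semi-open rule means in practice
(Python; the Cue class and active_cues function are hypothetical names
for illustration, not part of the proposal): a segment is active at
time t exactly when start <= t < end.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds, inclusive
    end: float    # seconds, exclusive
    text: str


def active_cues(cues, t):
    """Return the cues whose semi-open interval [start, end) contains t."""
    return [c for c in cues if c.start <= t < c.end]


cues = [Cue(0.0, 2.0, "first"), Cue(2.0, 4.0, "second")]

# At exactly t = 2.0 only the second cue is active, so back-to-back
# cues neither overlap nor leave a gap.
print([c.text for c in active_cues(cues, 2.0)])  # ['second']
```

With a closed interval [start,end], both cues would be active at
t = 2.0, which is the ambiguity the rule avoids.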

As for SRT - there will eventually be an RFC that specifies its basic
rules, but everything that has described it so far also states the
same rule, so while it currently doesn't have a formally specified
timing model, it is de facto specified.

So, I agree partially with what you are saying: namely that the timing
model for timed text segments as they are to be interpreted in HTML5
should be fixed to a semi-open interval of [start,end).

I do, however, disagree with the proposal to map everything to TTML,
since that may lead to a lossy representation. Rather, I believe that
every timed text segment needs to be mapped to HTML with a start and
end time specification. Thus, anything but the start/end time can
already be interpreted by HTML, while the start/end time provides the
mapping to the timeline of the media resource.

In this way, text formats are only exposed to any loss created by
mapping from FORMAT X -> HTML, rather than a potential double loss by
mapping from FORMAT X -> TTML -> HTML.
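
To illustrate the direct FORMAT X -> HTML path, here is a rough sketch
in Python that takes SRT as the example format. It extracts only the
start/end times and leaves the segment text untouched for later
interpretation as HTML. The function names are hypothetical, and the
regular expression covers only the common "HH:MM:SS,mmm --> HH:MM:SS,mmm"
timing line, so treat it as an assumption-laden example rather than a
complete SRT parser.

```python
import re

# Matches a timing line such as "00:01:02,500 --> 00:01:05,000".
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(h, m, s, ms):
    """Convert hour/minute/second/millisecond strings to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(data):
    """Yield (start, end, text) for each cue block in an SRT string."""
    for block in re.split(r"\n\s*\n", data.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the cue text,
                # passed through unchanged.
                yield _seconds(*g[:4]), _seconds(*g[4:]), "\n".join(lines[i + 1:])
                break


sample = (
    "1\n00:00:01,000 --> 00:00:04,000\nHello <i>world</i>\n\n"
    "2\n00:00:04,000 --> 00:00:06,500\nSecond line\n"
)
for start, end, text in parse_srt(sample):
    print(start, end, repr(text))
```

Only the timing is converted here; the markup inside the text is not
rewritten into an intermediate format, which is the point of avoiding
the extra mapping step.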



> For 2) "Maybe a CSS attribute such as 'letterbox: include/exclude'?" Making this an author choice is a good idea. But I wouldn't want to punt it to CSS. CSS controls the extent of the div, but this is slightly different.
>  Maybe we can have an additional attribute on track:
> @extent with values {media, container}
>
> Extent="media" means put the origin of the root rendering area for that track at the top left pixel in the video frame, and absent any information to the contrary in the TTML, makes the extent of it extend to the bottom right pixel in the video frame. The video frame may contain black bars, but these are not the same as bars applied by the UA
>
> Extent="container" means make the root rendering area coincide exactly with the layout div (as you had it before), this would cause the TTML to render over any black bars or padding applied by the UA.

SP:
I believe it's ultimately a styling issue, but I would also be
hesitant to have to wait for a CSS change.

So, that sounds like a good proposal to me. It would only apply to
video, I assume, since @extent=media on an audio element refers to
non-existent space.
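
To make the two @extent values concrete, here is a small Python sketch
of the geometry. It assumes the UA letterboxes by scaling the video
frame to fit the layout box while preserving aspect ratio and centering
it; that behaviour, and the function name root_area, are assumptions
for illustration only.

```python
def root_area(extent, box_w, box_h, video_w, video_h):
    """Return (x, y, width, height) of the root rendering area,
    in coordinates relative to the top left of the layout box."""
    if extent == "container":
        # Coincides exactly with the layout box, bars included.
        return (0.0, 0.0, float(box_w), float(box_h))
    if extent == "media":
        # Assumed letterboxing: scale to fit, keep aspect ratio, center.
        scale = min(box_w / video_w, box_h / video_h)
        w, h = video_w * scale, video_h * scale
        return ((box_w - w) / 2, (box_h - h) / 2, w, h)
    raise ValueError("extent must be 'media' or 'container'")


# A 640x360 video in a 400x300 box leaves 37.5px bars top and bottom.
print(root_area("media", 400, 300, 640, 360))      # (0.0, 37.5, 400.0, 225.0)
print(root_area("container", 400, 300, 640, 360))  # (0.0, 0.0, 400.0, 300.0)
```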

Also, I believe it would imply that @extent=container relates to the
calculated width x height of the media resource. We need to pay
attention here to the height of any default @controls that may be
present. These controls are overlaid onto the bottom part of the
video in all implementations and disappear if not used, so we need to
make sure to state something that stops the captions from colliding
with the controls.

Maybe we can make the container extent for video be the video height
minus the controls height, positioned above the controls. The
captions will then move up when the controls appear and back down
when they disappear. I've just uploaded a demo with Firefox that shows
the problem, see http://www.youtube.com/watch?v=Ojeh7ffhAk4 .
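
The idea of shrinking the container extent while controls are shown
could be sketched as follows (Python; caption_area is a hypothetical
helper, and the 30px controls height in the example is an arbitrary
value, since the actual height depends on the UA).

```python
def caption_area(box_w, box_h, controls_h, controls_visible):
    """Return (x, y, width, height) of the area available to captions.

    While the controls are visible, the area ends at their top edge,
    so captions anchored to the bottom move up by controls_h.
    """
    bottom = box_h - controls_h if controls_visible else box_h
    return (0, 0, box_w, max(bottom, 0))


print(caption_area(640, 360, 30, True))   # (0, 0, 640, 330)
print(caption_area(640, 360, 30, False))  # (0, 0, 640, 360)
```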



> 3) DAB has a number of possible text associations, including full web pages, but it seems they haven't thought of captions or subtitles yet.
>
> Provided we allow that <audio> can have a rendering area, then we can just give it the same default rendering as the <video> element, which is I believe 300x150px. If authors want they can get rid of it by making it 0 in either dimension using CSS, they won't be able to apply captions in that case; but perhaps they will have a transcript.

SP:
HTML5 doesn't define a rendering area for <audio> by default, since
<audio> is often used for background music on a Web page and thus is
not rendered at all. It will thus have to be the other way around: if
you want captions for your audio file, you have to give it a width and
height in CSS, which then defines the container extent.

We could, however, propose that if the audio resource has an enabled
text track, the audio element receives a default rendering area
similar to the <video> element. Though I would propose the area for
audio to be smaller - maybe 100x150px?


Regards,
Silvia.
Received on Monday, 5 April 2010 01:27:49 UTC