Re: timing model of the media resource in HTML5

On Sat, 28 Nov 2009 19:42:15 +0100, Maciej Stachowiak <mjs@apple.com>  
wrote:

>
> On Nov 25, 2009, at 2:49 PM, Philip Jägenstedt wrote:
>
>> On Wed, 25 Nov 2009 22:49:40 +0100, Eric Carlson
>> <eric.carlson@apple.com> wrote:
>>
>>>
>>> On Nov 25, 2009, at 12:02 PM, Philip Jägenstedt wrote:
>>>
>>>> On Wed, 25 Nov 2009 18:43:39 +0100, Eric Carlson  
>>>> <eric.carlson@apple.com> wrote:
>>>>
>>>>>
>>>>>  I think <overlay> should be used for internal subtitle and/or  
>>>>> closed caption tracks as well. Further, I think that we will want  
>>>>> them to "just work" so a UA should create an <overlay> element if  
>>>>> the markup doesn't have one and it finds that a file has internal  
>>>>> captions/subtitles:
>>>>>
>>>>>    <video src='my-captioned-movie'> </video>
>>>>>
>>>>
>>>> Yes, that sounds good. One issue is how to style such an implicit  
>>>> <overlay>. Should one actually include an <overlay> in the markup and  
>>>> somehow indicate that it can/should be used to render in-band  
>>>> subtitles from the resource?
>>>>
>>>> <video src="my-captioned-movie">
>>>> <caption style="font-weight:bold" magic-attribute></caption>
>>>> </video>
>>>>
>>>> Not awesome. Perhaps a new CSS pseudo-selector could be used? Other  
>>>> ideas?
>>>>
>>>   Actually I was imagining that *all* subtitles and captions, in-band  
>>> and external alike, would be rendered into an <overlay>. If the markup  
>>> doesn't include an <overlay> element the UA would actually insert one  
>>> into the DOM, as is done now for other missing elements (eg. tbody,  
>>> etc). This way the default style could be specified in the user agent  
>>> style sheet and the author could override as they wish.
>>>
>>
>> Right, all subtitles/captions should be rendered in <overlay>  
>> regardless of origin. If the parser automatically inserts an <overlay>  
>> element when there is none, what about the case where there is an  
>> <overlay> used to show custom controls? I imagined the need for a magic  
>> attribute/CSS selector/something to point out the correct <overlay> in  
>> cases like this. Possibly a magic src attribute? In any case, these are  
>> small issues that I'm sure could be sorted out if <overlay> is  
>> implemented.
>>
>>>   Speaking of author overrides, another issue we need to deal with is  
>>> authors that wish to handle captions themselves. Posting an event when  
>>> a new caption needs to be displayed seems logical enough, but how do  
>>> we provide access to the caption data in JavaScript?
>>>
>>
>> Here I think we should do something similar to cue ranges as has been  
>> discussed before in various places. A new event type would allow us to  
>> add some data, e.g.
>>
>> interface CueRangeEvent : Event {
>>  readonly attribute double startTime;
>>  readonly attribute double endTime;
>>  readonly attribute DOMString text;
>> };
>>
>> We would need to bring back addCueRange with some modifications.
>>
>> v.addCueRange(10 /* start */, 12 /* end */, "Hello");
>> v.addEventListener('cuerangeenter', function(e) {  
>> e.target.querySelector('overlay').textContent = e.text; }, false);
>> v.addEventListener('cuerangeleave', function(e) {  
>> e.target.querySelector('overlay').textContent = ''; }, false);
>>
>> You get the idea even if the above doesn't have the perfect  
>> interface/method/event names. Something along these lines should make  
>> it possible to handle in-band, external and script-created captions in  
>> a quite uniform fashion, as well as provide for whatever use cases the  
>> old cue range API had.
>
> This interface works ok for the specific case of popping up some text,  
> but it seems like it would be awkward for anything more complicated,  
> since there is only a single event and set of handlers. What I would  
> suggest is the declarative cue range idea that was suggested on the  
> whatwg list a while back:
>
> <video>
>      <source type="video/mp4" src="video.m4v">
>      <timerange start="10" end="12" onrangeenter="enterRange1()"  
> onrangeleave="leaveRange1()">
> </video>
>
> This makes it really easy to have different handlers per cue range  
> without having to express that difference as a string. It also makes it  
> simpler to use cue ranges for two orthogonal purposes.
>
> addCueRange() could just be a shortcut for adding such a <timerange>  
> element:
>
> var range = v.addCueRange(10, 12);
> range.addEventListener("rangeenter", function(e) {  
> e.target.querySelector('overlay').textContent = "Hello"; }, false);
> range.addEventListener("rangeleave", function(e) {  
> e.target.querySelector('overlay').textContent = ''; }, false);

If addCueRange does nothing but insert elements into the DOM, then we don't  
need it at all; script authors can write it themselves if they want a  
shortcut. It has been my working assumption that external SRT files  
should fire the same events, so not all ranges would be represented as  
elements in the DOM. addCueRange would then be a way to add such not-in-DOM  
ranges.
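
A rough sketch of what such not-in-DOM ranges could look like from script follows. Everything here is hypothetical: createCueRangeTracker, update and the callback parameters are invented names, and treating a range as start <= time < end is an assumption that nothing in this thread has settled.

```javascript
// Hypothetical helper, not part of any spec: keeps cue ranges in a plain
// array (not in the DOM) and reports enter/leave transitions.
function createCueRangeTracker() {
  const ranges = [];

  return {
    // Register a range; onEnter and onLeave are optional callbacks.
    addCueRange(start, end, onEnter, onLeave) {
      const range = { start, end, onEnter, onLeave, active: false };
      ranges.push(range);
      return range;
    },

    // Call with the current playback position in seconds, for example
    // from a 'timeupdate' listener: tracker.update(video.currentTime).
    update(time) {
      for (const range of ranges) {
        // Assumption: a range covers start <= time < end.
        const inside = time >= range.start && time < range.end;
        if (inside && !range.active) {
          range.active = true;
          if (range.onEnter) range.onEnter(range);
        } else if (!inside && range.active) {
          range.active = false;
          if (range.onLeave) range.onLeave(range);
        }
      }
    },
  };
}
```

Since each range carries its own callbacks, different ranges can behave differently without encoding that difference in a string.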

In <https://wiki.mozilla.org/Accessibility/Experiment1_feedback> I  
suggested a MediaTimeRange interface. Remixing that somewhat:

interface MediaTimeRange {
   attribute double start;
   attribute double end;
   //attribute DOMString text;
   // FIXME: how to represent the content?
};

interface MediaTimeRangeList {
   // automatically sorted by increasing time
   readonly attribute unsigned long length;
   getter MediaTimeRange item(in unsigned long index);
   void add(in MediaTimeRange range);
   void remove(in MediaTimeRange range);
   // these last two look suspiciously similar to appendChild and removeChild
};

interface HTMLItextlistOverlayWhateverElement : HTMLElement {
   attribute MediaTimeRangeList ranges;
};

The problem I was trying to solve is that of representing the time ranges  
uniformly regardless of their source. External subtitles can be accessed  
and modified via MediaTimeRangeList. <timerange> gets mapped into a  
MediaTimeRangeList. A MediaTimeRangeList can be constructed by scripts.
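
To make the "constructed by scripts" case concrete, here is a minimal script-side stand-in for the list. It is a sketch only: no UA implements this interface, item() returning the range object (and null when out of bounds) is an assumption, and sorting by start time is one reading of "sorted by increasing time".

```javascript
// Hypothetical script-side stand-in for the MediaTimeRangeList idea.
class MediaTimeRangeList {
  constructor() {
    this._ranges = [];
  }

  get length() {
    return this._ranges.length;
  }

  // Returns the range at index, or null when out of bounds (assumption).
  item(index) {
    return this._ranges[index] || null;
  }

  // Inserts while keeping the list sorted by increasing start time.
  // Ranges with equal start times keep their insertion order.
  add(range) {
    let i = 0;
    while (i < this._ranges.length && this._ranges[i].start <= range.start) {
      i++;
    }
    this._ranges.splice(i, 0, range);
  }

  // Removes the given range object if present; otherwise does nothing.
  remove(range) {
    const i = this._ranges.indexOf(range);
    if (i !== -1) this._ranges.splice(i, 1);
  }
}
```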

However, make note of the FIXME. Because not all external subtitle formats  
can be represented as plain text, there are basically 3 options:

1. Make the content completely opaque. Makes modification impossible, but  
the same is true of almost any external resource.

2. Reduce the content to plain text. Modification would then destroy what  
extra style information there was.

3. Transcode the content to HTML+CSS. Basically, while parsing external  
SRT, the UA would construct an equivalent HTML DOM as children of  
<itextlist-overlay-whatever>. This would actually make the MediaTimeRange  
idea above redundant because the information would already be in the DOM.  
All in all though, this would be quite strange and is not a serious  
suggestion.
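
Option 2 can be illustrated with a small sketch that reduces SRT to plain-text cues. SRT has no formal specification, so the layout assumed here (counter line, "HH:MM:SS,mmm --> HH:MM:SS,mmm" timing line, then text) is the commonly seen one; parseSrtAsPlainText and srtTimeToSeconds are made-up names.

```javascript
// Converts an "HH:MM:SS,mmm" timestamp to seconds.
function srtTimeToSeconds(stamp) {
  const m = /^(\d+):(\d{2}):(\d{2})[,.](\d{3})$/.exec(stamp);
  if (!m) throw new Error("Bad SRT timestamp: " + stamp);
  return Number(m[1]) * 3600 + Number(m[2]) * 60 + Number(m[3]) + Number(m[4]) / 1000;
}

// Parses SRT text into [{ start, end, text }], dropping inline tags such as <i>.
function parseSrtAsPlainText(srt) {
  const cues = [];
  // Normalize line endings, then split into blank-line-separated blocks.
  const blocks = srt.replace(/\r\n?/g, "\n").trim().split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n");
    // The first line is normally a counter; find the line carrying the timing.
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue; // skip malformed blocks
    const timing = /(\S+)\s*-->\s*(\S+)/.exec(lines[timingIndex]);
    if (!timing) continue;
    const text = lines
      .slice(timingIndex + 1)
      .join("\n")
      .replace(/<[^>]*>/g, ""); // this is where the style information is lost
    cues.push({
      start: srtTimeToSeconds(timing[1]),
      end: srtTimeToSeconds(timing[2]),
      text,
    });
  }
  return cues;
}
```

The tag-stripping line is the whole problem in miniature: once the markup is gone, writing the cue back cannot restore it.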

Looking at the above, trying to force time ranges from all sources into a  
single interface isn't looking good. Perhaps the whole effort is  
misguided. Is there in fact a use case for accessing/modifying the time  
ranges and contents of external subtitles? For getting callbacks/events  
when such ranges are entered and left? For styling such content with CSS?

By throwing away all interaction between external subtitles and the DOM,  
cross-origin issues become irrelevant. The only use case I think is  
actually... useful... is styling it with CSS. For now, I will abandon the  
working assumption about SRT firing events, etc.

> Another possibility is that <timerange> elements have contents which  
> automatically become visible or hidden depending on whether content is  
> in the range, so the common use case (make some content appear during  
> certain time ranges of the video) work without any script:
>
> <video>
>      <source type="video/mp4" src="video.m4v">
>      <timerange start="10" end="12">Hello</timerange>
> </video>
>
> The contents could be arbitrary HTML, which would make it very simple to  
> sync a slideshow to a video, in addition to handling the captions use  
> case. CSS styling could be used to position the currently visible  
> <timerange> over the video.
>

I quite like the declarative syntax in the last example, but I think that  
<timerange> should have a wrapping element, the same one used to  
reference external time ranges (a.k.a. subtitles). Mostly this is to group  
them into "tracks".

<video>
   <source type="video/mp4" src="video.m4v">
   <itextlist-overlay-whatever lang="zh" src="chinese.srt"></itextlist-overlay-whatever>
   <itextlist-overlay-whatever lang="en">
     <timerange start="10" end="12">Hello</timerange>
   </itextlist-overlay-whatever>
   <itextlist-overlay-whatever lang="sv">
     <timerange start="10" end="12">Hej</timerange>
   </itextlist-overlay-whatever>
</video>

I suppose that for styling, we would have a CSS pseudo-class  
:yourtimeisnow? A probable default style would then be

timerange { display:none; }
timerange:yourtimeisnow { display: block; }

If we use some declarative time range syntax, surely the next thing people  
will want is to be able to use it outside of <video>.

<video id="v0" src="my-video"></video>
Subtitles below:
<div>
   <timerange start="10" end="12" ref="v0">Hello</timerange>
</div>

Good idea? When people inevitably ask for this, I think we should tell  
them to do it with scripts instead.
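
For what it's worth, the script version is short. The sketch below assumes ordinary elements marked up with data-start, data-end and data-ref attributes in place of <timerange>; those attribute names and syncTimeRanges are invented for illustration, and the inclusive start / exclusive end rule is likewise an assumption.

```javascript
// Shows each entry's element only while `time` lies inside its range.
// Each entry is { start, end, element }; the element needs only a `style`
// object, so ordinary DOM elements work.
function syncTimeRanges(entries, time) {
  for (const { start, end, element } of entries) {
    // Assumption: start is inclusive and end is exclusive.
    const active = time >= start && time < end;
    element.style.display = active ? "" : "none";
  }
}

// Browser wiring, guarded so the function above also runs outside a page.
// Assumes markup like <div data-start="10" data-end="12" data-ref="v0">.
if (typeof document !== "undefined") {
  const video = document.getElementById("v0");
  if (video) {
    const entries = Array.from(
      document.querySelectorAll('[data-ref="v0"]'),
      (element) => ({
        start: Number(element.dataset.start),
        end: Number(element.dataset.end),
        element,
      })
    );
    const update = () => syncTimeRanges(entries, video.currentTime);
    video.addEventListener("timeupdate", update);
    update(); // set the initial visibility before playback starts
  }
}
```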

-- 
Philip Jägenstedt
Core Developer
Opera Software

Received on Sunday, 29 November 2009 12:11:46 UTC