Re: Dropping cues when playbackRate != 1.0

On Aug 22, 2013, at 6:32 AM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:

> 
> 
> 
> On Thu, Aug 22, 2013 at 10:28 AM, Eric Carlson <eric.carlson@apple.com> wrote:
> 
> On Aug 21, 2013, at 5:12 PM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
> 
>> 
>> 
>> 
>> On Sun, Aug 18, 2013 at 12:42 AM, Eric Carlson <eric.carlson@apple.com> wrote:
>> 
>> 
>> On Aug 17, 2013, at 5:40 AM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
>> 
>>> 
>>> 
>>> 
>>> On Sat, Aug 17, 2013 at 3:17 AM, Eric Carlson <eric.carlson@apple.com> wrote:
>>> 
>>> On Aug 8, 2013, at 7:21 PM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
>>> 
>>> > On Fri, Aug 9, 2013 at 6:11 AM, Brendan Long <self@brendanlong.com> wrote:
>>> >> There's a section of the HTML5 CR spec saying:
>>> >>
>>> >> Similarly, when the playback rate is not exactly 1.0, hardware, software, or
>>> >> format limitations can cause video frames to be dropped and audio to be
>>> >> choppy or muted.
>>> >>
>>> >>
>>> >> I think this should also apply to text track cues, something like:
>>> >>
>>> >> Similarly, when the playback rate is not exactly 1.0, hardware, software, or
>>> >> format limitations can cause video frames and text cues to be dropped and
>>> >> audio to be choppy or muted.
>>> >>
>>> >> When playing at non-standard speeds, an efficient implementation may want to
>>> >> skip portions of the file, which could mean skipping cues.
>>> >
>>> > I assume you are talking about a situation where the playback would
>>> > need to jump multiple seconds (call them "s") at a time to achieve the
>>> > speedup.
>>> > And you are further assuming that when the current time jumps from
>>> > second x to second x+s, there may be a cue that would have both its
>>> > start and end times between x and x+s.
>>> > So you're saying that if there is no audio or video rendered to which
>>> > the cue refers, the "time marches on" algorithm doesn't actually
>>> > activate these cues and therefore they are "dropped".
>>> >
>>> > In actual fact, the "time marches on" algorithm will activate the
>>> > enter and exit events of the "missed cues", so they are not really
>>> > "dropped". They are, however, not rendered, because they don't become
>>> > active.
>>> >
>>>   This requires that all of the media data for time X to X+S has to be loaded before skipping ahead.
>>> 
>>> Why? Only the media data for time x and for time x+s has to be loaded - nothing else.
>>> 
>>>  
>>> This means that a UA cannot play at a faster rate by not loading the entire file, for example by only loading and displaying key frames.
>>> 
>>>   If this is allowed, and it seems silly to prevent it, cues defined in the portions of the file that are not loaded would have to be skipped.
>>> 
>>> I guess we have two different kinds of cues: those coming from <track> and those coming from inside a media file.
>>> 
>>> Those from a <track> element are all loaded and events could be activated as the timeline skips over the cue start/end times. Cue content would be rendered when X or X+S is within a cue's interval.
>>> 
>>> Those from inside a media file, if multiplexed, may get completely skipped by going from X to X+S without the browser loading those parts of a file. So, in this case you'd probably skip cues.
>>> 
>> 
>>   Yes, sorry for being unclear, I was talking about in-band captions. 
>> 
>> 
>> I'm trying to figure out whether this may be a problem, both because it's inconsistent between in-band and <track> based text tracks, and because JS developers may need to be notified of skipped cues.
>> 
>> In your experience, for in-band tracks, would it be possible to require that the browser not fast-forward text tracks, but only skip audio / video track sections?
>> 
> 
>   Not unless we disallow seeking to an un-buffered time. There is no chance I would make a change like that.
> 
>> I guess it would require writing some sort of index of the cues into the file. It's possible in Ogg with a special Skeleton header for text tracks.
>> 
> 
>   That won't help because the index will only give you the cue timestamps, not the text.
> 
> That would be enough to raise events for skipped cues. 
> 
  No. 'enter' and 'exit' events target the TextTrackCue object [1], so if you don't have enough data to construct the cue object, you don't have enough data to fire the event. The index only gives you a cue's start time, so there isn't enough data to construct the cue object.
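
  To make the point concrete, here is a rough sketch of the "missed cue" part of "time marches on" when playback jumps from one time to another. It is a simplification, not the spec algorithm: the Cue class and the function names are made up for illustration, and event ordering is reduced to enter-then-exit per cue. Each event is paired with a cue object, so a cue that was never loaded (and is therefore absent from the list) cannot produce an event, even if an index records its start time.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Cue:
    """Hypothetical stand-in for a fully constructed TextTrackCue."""
    start: float
    end: float
    text: str


def missed_cues(loaded: List[Cue], last_time: float, now: float) -> List[Cue]:
    """Cues that both start and end inside the skipped interval."""
    return [c for c in loaded if c.start >= last_time and c.end <= now]


def events_for_skip(loaded: List[Cue], last_time: float, now: float) -> List[Tuple[str, Cue]]:
    """Pair each event with its target cue; no cue object means no event."""
    events = []
    for cue in missed_cues(loaded, last_time, now):
        events.append(("enter", cue))
        events.append(("exit", cue))
    return events


if __name__ == "__main__":
    loaded = [Cue(1.0, 2.0, "a"), Cue(4.0, 6.0, "b"), Cue(9.0, 12.0, "c")]
    # Jump from t=3 to t=8: only the cue spanning 4..6 is skipped over.
    for name, cue in events_for_skip(loaded, 3.0, 8.0):
        print(name, cue.text)
```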

> Are you doing the fast forward/reverse by skipping byte ranges or by skipping via an index?
> 
  They are orthogonal. If a file has an "index" ('moov' chunk, Skeleton header, etc), you always use that to determine the offset of the seek destination. If the data for the seek destination has not already been loaded, you use a byte-range request to get it from the server. 

  If a file doesn't have an index, e.g. an MPEG-2 transport stream or MP3 file, you guess the file offset that corresponds to the seek time, seek there, and look at timestamps.
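
  As a rough sketch of the two cases (not how any particular UA does it): the index format, the function names, and the timestamp_at callback are all assumptions made for illustration. With an index, the offset is a lookup; without one, the first guess assumes a constant bitrate and is then refined by looking at the timestamps found at each guessed offset.

```python
import bisect
from typing import Callable, Dict, List, Tuple


def offset_from_index(index: List[Tuple[float, int]], t: float) -> int:
    """index is a sorted list of (time, byte_offset) pairs.
    Return the offset of the last entry at or before time t."""
    times = [time for time, _ in index]
    i = bisect.bisect_right(times, t) - 1
    return index[max(i, 0)][1]


def range_header(offset: int, length: int) -> Dict[str, str]:
    """HTTP Range header for fetching data that is not yet loaded."""
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def seek_without_index(file_size: int, duration: float, target: float,
                       timestamp_at: Callable[[int], float],
                       tolerance: float = 0.5) -> int:
    """Guess an offset, read the timestamp there, and narrow the range.
    timestamp_at is a hypothetical callback that returns the media time
    found at a byte offset; it is assumed to increase with the offset."""
    lo, hi = 0, file_size
    offset = int(file_size * target / duration)  # constant-bitrate guess
    for _ in range(32):
        ts = timestamp_at(offset)
        if abs(ts - target) <= tolerance:
            return offset
        if ts < target:
            lo = offset
        else:
            hi = offset
        offset = (lo + hi) // 2
    return offset


if __name__ == "__main__":
    index = [(0.0, 0), (10.0, 5000), (20.0, 12000)]
    print(offset_from_index(index, 15.0))
    print(range_header(5000, 1024))
```

  Neither path reads the bytes between the old position and the seek destination, which is why in-band cues stored in that skipped range are never seen.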

> If you have an index of the locations in the file at which the cues are, both as a time and byte offset, then you can make sure not to seek past any cues. So, I think it could work. It would, of course, all depend on how the fast forward/reverse is implemented.
> 
  It could be made to work, but only if you require a UA to always load *all* cue data. 

eric

[1] http://www.w3.org/TR/html5/embedded-content-0.html#prepare-an-event

Received on Thursday, 22 August 2013 16:18:48 UTC