[Bug 18400] Define and document timestamp heuristics

https://www.w3.org/Bugs/Public/show_bug.cgi?id=18400

--- Comment #4 from Aaron Colwell <acolwell@chromium.org> 2012-08-13 20:38:03 UTC ---
Comments inline.

(In reply to comment #2)
> Proposal inline below:
> 
> (In reply to comment #0)
> > There are several situations where heuristics are needed to resolve issues with
> > the timestamps in media segments. The following list indicates issues the
> > Chrome team has encountered so far :
> > 
> > 1. How close does the end of one media segment need to be to the beginning of
> > another to be considered a gapless splice? Media segments can't always align
> > exactly, especially in adaptive content, and they may be close but don't
> > overlap.
> 
> More generally, if there is a gap in the media data in a Source Buffer, the
> media element should play continuously across the gap if the duration of the
> gap is less than 2 (?) video frame intervals or less than 2 (?) audio frame
> durations. Otherwise the media element should pause and wait for receipt of
> data.

[acolwell] Sounds like a reasonable start. How are the "video frame interval"
and "audio frame duration" determined? Media segments could have different
frame rates, and codecs like Vorbis have variable audio frame durations (i.e.
long and short overlap windows).
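To make the proposed rule concrete, here is a minimal sketch in TypeScript. The 2-frame-interval tolerance and the fixed 30fps frame rate are assumptions for illustration; as noted above, the actual interval may vary per segment.

```typescript
type Range = [number, number]; // [start, end] in seconds

// Coalesce buffered ranges, treating any gap shorter than `tolerance`
// (e.g. 2 video frame intervals) as continuous playback.
function coalesce(ranges: Range[], tolerance: number): Range[] {
  const sorted = ranges
    .map(([s, e]): Range => [s, e]) // copy so the input isn't mutated
    .sort((a, b) => a[0] - b[0]);
  const out: Range[] = [];
  for (const [start, end] of sorted) {
    const last = out[out.length - 1];
    if (last && start - last[1] < tolerance) {
      last[1] = Math.max(last[1], end); // gap is small enough: merge
    } else {
      out.push([start, end]);
    }
  }
  return out;
}

// At 30fps the frame interval is ~33ms, so 2 intervals is ~67ms.
console.log(coalesce([[0, 5], [5.04, 10], [10.5, 12]], 2 / 30));
// → [[0, 10], [10.5, 12]]: the 40ms gap merges, the 500ms gap does not
```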

> 
> > 
> > 2. How far apart do track ranges need to be for the UA to consider media data
> > to be missing? For example:  audio [5-10) video [5.033-10) and I seek to 5.
> > Technically I don't have video @ t=5, but the UA should likely allow the seek
> > to complete because 5.033 is "close enough".
> 
> This is covered by the rule above for (1).
> 
> If there is media within 2 (?) video frame intervals or 2 (?) audio frame
> durations of the seek position then playback can begin.

[acolwell] I agree.
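Under the same assumptions, the seek rule might be sketched like this (`canSeekTo` and the tolerance value are illustrative, not spec text):

```typescript
type Range = [number, number]; // [start, end] in seconds

// Allow a seek to complete if buffered media begins within `tolerance`
// of the seek position -- the "close enough" rule from (2).
function canSeekTo(t: number, ranges: Range[], tolerance: number): boolean {
  return ranges.some(([start, end]) => t >= start - tolerance && t < end);
}

// Video buffered from 5.033: a seek to 5 is within 2 frame intervals (~67ms).
console.log(canSeekTo(5, [[5.033, 10]], 2 / 30)); // → true
console.log(canSeekTo(5, [[7, 10]], 2 / 30));     // → false
```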

> 
> > 
> > 3. How close do timestamps need to be to 0 to be equivalent to t=0? Content may
> > not always start at exactly 0 so how much room do we want to allow here, if
> > any? This may be related to #2, but I wanted to call it out just in case we
> > wanted to handle the start time slightly differently.
> 
> I believe the start time should be zero. If the first frame is at time 33ms,
> then that means you should render 33ms of blank screen, then the first frame.
> Rules for whether playback can start are as above.

[acolwell] I agree.

> 
> > 
> > 4. How should the UA estimate the duration of a media segment if the last frame
> > in the segment doesn't have duration information? (i.e. WebM clusters aren't
> > required to have an explicit cluster duration. It's possible, but not required
> > currently)
> 
> The rules above enable the UA to determine whether there is a real gap between
> segments. This obviates the need to know segment duration except for
> determination of the content duration. The content duration should just be set
> to the timestamp of the last video frame or the end of the last audio frame,
> whichever is later.

[acolwell] This becomes more complicated when overlaps are involved. Without
knowing the actual duration of segments, it becomes tricky to resolve certain
kinds of overlaps. I'll try to provide an example to illustrate the problem.


Initial source buffer state.
+-----------+--+--+----------+
:A          |A |A |A         |  
+-----------+--+--+----------+

A new segment gets appended, and we don't know its duration.
+--------+-???
:B       |B     
+--------+-???  

Resolve the overlap, assuming the new segment extends to the start of the next
buffered frame.
+--------+--+--+--+----------+
:B       |B |A |A |A         | 
+--------+--+--+--+----------+ 

Append the segment that is supposed to be right after B.
               +------+------+
               :C     |C     | 
               +------+------+ 

Resolve the overlap.
+--------+--+--+------+------+
:B       |B |A :C     |C     | 
+--------+--+--+------+------+ 

If B & C had been appended to an empty source buffer, you would have gotten the
following, which is likely what the application intended.
+--------+-----+------+------+
:B       |B    :C     |C     |
+--------+-----+------+------+

This is not a hypothetical example. We actually ran into this problem while
trying to overlap Vorbis data.

Note that a "wait until the next segment is appended" rule won't help here
because segments are not required to be appended in order and discontinuous
appends are not explicitly signalled. 

Assuming a duration of 1-2 frame intervals can also get you into trouble,
because it may cause a keyframe to get dropped, which could result in the loss
of a whole GOP.
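The failure mode above can be reproduced with a small model. The frame ids, timestamps, and `appendFrames` helper below are hypothetical, chosen to mirror the diagrams; this is not the actual MSE coded-frame-processing algorithm.

```typescript
type Frame = { id: string; start: number; end: number };

// Model of overlap resolution: drop buffered frames that overlap the new
// segment's estimated range [s, e), then insert the new frames in order.
function appendFrames(buf: Frame[], seg: Frame[]): Frame[] {
  const s = seg[0].start;
  const e = seg[seg.length - 1].end;
  const kept = buf.filter(f => f.end <= s || f.start >= e);
  return [...kept, ...seg].sort((a, b) => a.start - b.start);
}

// Initial buffer of A frames (timestamps mirror the diagrams).
let buf: Frame[] = [
  { id: "A1", start: 0, end: 11 },
  { id: "A2", start: 11, end: 14 },
  { id: "A3", start: 14, end: 17 },
  { id: "A4", start: 17, end: 27 },
];

// Segment B: its last frame's duration is unknown, so the heuristic
// guesses it ends at the next buffered frame's start (11) -- too short.
buf = appendFrames(buf, [
  { id: "B1", start: 0, end: 9 },
  { id: "B2", start: 9, end: 11 }, // true end is 14
]);

// Segment C is supposed to start right after B.
buf = appendFrames(buf, [
  { id: "C1", start: 14, end: 21 },
  { id: "C2", start: 21, end: 27 },
]);

console.log(buf.map(f => f.id).join(" "));
// → "B1 B2 A2 C1 C2": the stale A2 frame survives between B and C
```

Had B2's true end (14) been known, the first append would have removed A2 as well, yielding the intended "B1 B2 C1 C2".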

> 
> > 
> > 5. How should SourceBuffer.buffered values be merged into a single
> > HTMLMediaElement.buffered? Simple range intersection? Should heuristic values
> > like estimated duration (#4) or "close enough" values (#2) be applied before
> > computing the intersection?
> 
> The heuristics of (1) should be used to determine SourceBuffer.buffered, i.e.
> gaps of less than 2 frame intervals do not result in disjoint intervals in the
> SourceBuffer.buffered array.
> 
> Then the intersection of the SourceBuffer.buffered arrays for the active
> source buffers appears as the HTMLMediaElement.buffered.

[acolwell] OK. Does this also apply after endOfStream() is called? Currently
Chrome returns the intersection of all ranges while in "open", but uses the
intersection plus the union of the end ranges, if they overlap, while in
"ended". The main reason was to handle the case where the streams have slightly
different lengths. Taking the union of the last overlapping range at least
allows buffered to reflect playing out to the duration when the stream lengths
differ by more than 2 frame intervals.
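For reference, a sketch of the two merge behaviors discussed here. The `intersect` and `mergedBuffered` names are illustrative; the "ended" branch models the Chrome behavior described above, under the assumption of two active tracks.

```typescript
type Range = [number, number]; // [start, end] in seconds

// Intersect two sorted, disjoint range lists with a two-pointer sweep.
function intersect(a: Range[], b: Range[]): Range[] {
  const out: Range[] = [];
  let i = 0, j = 0;
  while (i < a.length && j < b.length) {
    const s = Math.max(a[i][0], b[j][0]);
    const e = Math.min(a[i][1], b[j][1]);
    if (s < e) out.push([s, e]);
    if (a[i][1] < b[j][1]) i++; else j++;
  }
  return out;
}

// "open": plain intersection. "ended": if the final ranges of the two
// tracks overlap, extend the last merged range to the later of their
// ends, so buffered can reflect playback out to the duration.
function mergedBuffered(a: Range[], b: Range[], ended: boolean): Range[] {
  const out = intersect(a, b);
  const lastA = a[a.length - 1];
  const lastB = b[b.length - 1];
  if (ended && out.length && lastA[0] < lastB[1] && lastB[0] < lastA[1]) {
    out[out.length - 1][1] = Math.max(lastA[1], lastB[1]);
  }
  return out;
}

console.log(mergedBuffered([[0, 10]], [[0, 9.9]], false)); // → [[0, 9.9]]
console.log(mergedBuffered([[0, 10]], [[0, 9.9]], true));  // → [[0, 10]]
```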


Received on Monday, 13 August 2012 20:38:05 UTC