Re: MSE: Ad-insertion and seeking from David LaPalomento on 2015-08-28 (public-html-media@w3.org from August 2015)

From: David LaPalomento <dlapalomento@brightcove.com>
Date: Fri, 28 Aug 2015 10:10:48 -0400
To: Matt Wolenetz <wolenetz@google.com>
Cc: public-html-media@w3.org
Message-ID: <CACh87od10abVpghbuR-UD8k37zZPTFv42egE7+_1whgxNqWw_Q@mail.gmail.com>

Maybe I'm misunderstanding the intended usage of the API. Is it possible to
seek to a later point in the video when no buffered range ends at the
position where we want to append the media? That's my big assumption going
into this and why I'm unable to provide "x" in your example.
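
To make the "x" problem concrete, here is a minimal TypeScript sketch of the
timestampOffset = x - y calculation from the quoted message below. It works
on plain data copied out of SourceBuffer.buffered (via start(i) and end(i))
so it can run outside a browser. The names BufferedRange, findAppendPoint,
and offsetToCloseGap, and the tolerance parameter, are hypothetical and only
for illustration; only SourceBuffer.buffered and timestampOffset come from
MSE.

```typescript
// One entry of SourceBuffer.buffered, copied into plain data (seconds).
interface BufferedRange {
  start: number;
  end: number;
}

// "x": the end time of a buffered range that ends at the position where the
// new media should be appended. `tolerance` (seconds) is an assumption of
// this sketch, not something MSE defines. Returns null when no buffered
// range ends there, e.g. after a seek into an unbuffered region.
function findAppendPoint(
  ranges: BufferedRange[],
  target: number,
  tolerance: number
): number | null {
  for (const r of ranges) {
    if (Math.abs(r.end - target) <= tolerance) {
      return r.end;
    }
  }
  return null;
}

// timestampOffset = x - y, where y is the start timestamp of the media that
// is about to be appended. Returns null when "x" cannot be determined.
function offsetToCloseGap(
  ranges: BufferedRange[],
  target: number,
  mediaStart: number,
  tolerance: number
): number | null {
  const x = findAppendPoint(ranges, target, tolerance);
  return x === null ? null : x - mediaStart;
}
```

With 0 to 10 seconds buffered, appending at 10 gives a usable offset. After a
seek to 120 there is no range end to use as "x", so the sketch returns null.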

I thought putting some visuals together might clarify the problem:
https://github.com/dmlap/seeking-mse-example/blob/master/seeking-across-discontinuities.md.
While I was working on that, it occurred to me we could work around this by
removing all buffered regions ahead of the current media position when the
user seeks. We could end up doing some not-strictly-necessary re-buffering
but we'd avoid the possibility of content overlap. Assuming the answer to
my question in the first paragraph is "yes", does that solution sound like
the right approach to you?
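
A rough TypeScript sketch of that workaround, under some assumptions: it only
computes which spans to clear, using plain data copied from
SourceBuffer.buffered, and it takes the position to be the seek target, which
is one reading of the idea above. BufferedRange and rangesToRemoveAhead are
hypothetical names for illustration.

```typescript
// One entry of SourceBuffer.buffered, copied into plain data (seconds).
interface BufferedRange {
  start: number;
  end: number;
}

// Returns the spans that would need to be cleared so that nothing remains
// buffered after `position`. Ranges entirely behind the position are kept;
// a range that straddles the position is trimmed from the position onward.
function rangesToRemoveAhead(
  ranges: BufferedRange[],
  position: number
): BufferedRange[] {
  const removals: BufferedRange[] = [];
  for (const r of ranges) {
    if (r.end <= position) {
      continue; // entirely behind the position, nothing to clear
    }
    removals.push({ start: Math.max(r.start, position), end: r.end });
  }
  return removals;
}
```

In a player, each returned span would be passed to
SourceBuffer.remove(start, end), waiting for the "updateend" event between
calls, since remove() cannot be called while the SourceBuffer is updating.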

On Wed, Aug 26, 2015 at 8:42 PM, Matt Wolenetz <wolenetz@google.com> wrote:

> Hi David,
>
> #1 isn't a necessary prerequisite for #2 in your example, if I understand
> correctly. In fact, surprising behavior may occur if you overlap-append an
> existing buffered range right at, or near, currentTime in the timeline.
> Chrome, for example, attempts to play out the remainder of an overlapped
> GOP until the next keyframe in the newly appended media, but upon seeking
> back, may play more of the newer content in the overlapped-append region.
> This is an artifact of varying decoder pipeline depths.
>
> I am also confused why imprecise duration is the issue when calculating
> timestampOffset to close the gap. The app can reliably inspect
> SourceBuffer.buffered, and needs a reliably precise expected start
> timestamp "y" of the media it is about to append. Given those, it could set
> SourceBuffer.timestampOffset = x - y and then append the media, where "x"
> is the time from SourceBuffer.buffered corresponding to the end time of the
> buffered range just prior to the point in the timeline where you want the
> new media actually to be appended, and "y" is the start timestamp of the
> media that is about to be appended (from the bytestream or some other
> reliable source). This should cause all timestamps in the newly appended
> media to be adjusted.
> Assuming the newly appended media is a single continuously
> increasing-in-decode-timestamp sequence, this sequence's timestamps will be
> adjusted to move it to begin right at the desired time "x" in the timeline.
> If the sequence is discontinuous and the app wishes to collapse all gaps,
> it would need to append the media segments more granularly (each segment
> should be in DTS sequence and continuous), and adapt timestampOffset
> between each. Essentially, this is doing an approximated polyfill of
> "sequence" appendMode.
>
>
> On Wed, Aug 26, 2015 at 7:34 AM David LaPalomento <
> dlapalomento@brightcove.com> wrote:
>
>> Hi Matt,
>> Thanks for the response! "sequence" mode does sound like it could make
>> discontinuity handling less of a pain. Just to make sure I follow your
>> suggestion, tell me if this sounds right to you:
>>
>> 1) video.currentTime is set to a value that causes playback to cross over
>> a known timestamp discontinuity.
>> 2) The application selects a value for sourceBuffer.timestampOffset to
>> adjust the target media's timestamps to account for the discontinuity and
>> allow playback to begin.
>>
>> This works today in my testing so far. My problem happens a little bit
>> further on from this, though. Since I'm dealing with somewhat unreliable
>> third-party content, I can't be confident of segment durations without
>> downloading and inspecting the segments. That leaves me with the tough
>> choice of taking a guess on the appropriate timestampOffset in step 2) and
>> risking overlapping content if the user seeks back and plays through the
>> content again; or downloading all intervening segments before the seek can
>> complete and forcing the user to sit through a painful amount of buffering.
>>
>> Basically, ad insertion requires dealing with third-party content and (in
>> my experience, at least) you can't rely on those parties for accurate
>> duration information to set timestampOffset. Does that make sense? Did I
>> miss something from your advice?
>>
>> On Tue, Aug 25, 2015 at 5:43 PM, Matt Wolenetz <wolenetz@google.com>
>> wrote:
>>
>>> Hi David,
>>>
>>> I am one of the current co-editors of the MSE spec. Thanks for your
>>> question.
>>>
>>> This appears to be a problem that "sequence" appendMode may alleviate:
>>> it collapses all discontinuities into one continuous buffered region, so
>>> long as there are no other intervening operations like explicitly changing
>>> timestampOffset or appendMode. "sequence" appendMode currently has
>>> experimental support in Chrome M46 (it's hidden behind an experimental
>>> flag). The caveat for using sequence appendMode (beyond, of
>>> course, having implementations in user agents) is that the appends must be
>>> done in the order they are desired on the timeline, not in a scattered
>>> fashion.
>>>
>>> While implementations are pending "sequence" appendMode support,
>>> explicitly updating timestampOffset to collapse potential
>>> discontinuities would be feasible if you:
>>> 1) know the current SourceBuffer.buffered range(s) end time(s): this
>>> is available in MSE.
>>> 2) know the start timestamp of media about to be appended (by inspection
>>> offline, or even in a js parser)
>>>
>>> Combined, these are like timestamp rewriting, except the rewriting is
>>> done implicitly by timestampOffset, rather than updating the timecodes in
>>> the appended byte stream.
>>>
>>> Is the latter method (explicitly updating timestampOffset using data
>>> from the API and from the byte stream (or offline inspection/metadata/some
>>> other assumption)) sufficient for your use case?
>>>
>>> Matt
>>>
>>> On Tue, Aug 25, 2015 at 7:53 AM David LaPalomento <
>>> dlapalomento@brightcove.com> wrote:
>>>
>>>> Hi all,
>>>> I'm a contributor to video.js and am working to convert an existing
>>>> Flash-based HLS playback plugin to using Media Source Extensions. We
>>>> support a number of server-side ad insertion services, all of which seem to
>>>> compose existing media files without doing timestamp rewriting and signal
>>>> this to the player through metadata in the HLS manifest.
>>>>
>>>> One of the technical challenges we've faced in the existing
>>>> implementation is handling seeking across multiple timestamp
>>>> discontinuities before the entire video has been downloaded. HLS v3 rounds
>>>> segment durations to the nearest whole number which can introduce a
>>>> significant amount of timeline error in long-form content. Ignoring the
>>>> shortcomings of HLS though, the duration values provided by ad-insertion
>>>> services may lack precision and the wild-west of ad creatives doesn't help
>>>> the situation.
>>>>
>>>> We handle this issue today by recalculating the media timeline whenever
>>>> a new segment is downloaded and processed. Since the buffer always grows
>>>> forward, media timeline adjustments occur ahead of the current playback
>>>> position and the player's media timeline converges on reality as more
>>>> content is buffered.
>>>>
>>>> Preamble out of the way, here's my question for this group: how would
>>>> one seek across discontinuities without frame-accurate durations using
>>>> Media Source Extensions? If we had perfectly accurate duration information,
>>>> I believe we could use timestamp offsets on the source buffer to place the
>>>> new content at the appropriate position. With inaccurate or low-precision
>>>> duration information, it seems like we run the risk of mis-placing the
>>>> media data and creating overlaps at discontinuities and misreporting the
>>>> total content duration. Is there a solution in the spec I'm missing?
>>>>
>>>
>>
Received on Friday, 28 August 2015 14:11:17 UTC