- From: Aaron Colwell <acolwell@google.com>
- Date: Mon, 25 Feb 2013 08:16:46 -0800
- To: Michael Thornburgh <mthornbu@adobe.com>
- Cc: "public-html-media@w3.org" <public-html-media@w3.org>
- Message-ID: <CAA0c1bAS5=0tK9kLK=2O-sgux4+_GoMay_1JXagMtjyTY2w9OA@mail.gmail.com>
Hi Michael,

I think I see what you are getting at here. I believe this functionality would essentially sit between steps 7 & 8 of the coded frame processing algorithm <https://dvcs.w3.org/hg/html-media/raw-file/default/media-source/media-source.html#sourcebuffer-coded-frame-processing> and would act as a "coded frame filter" or "append window". How about this for an initial proposal?

*Proposal:*

  partial interface SourceBuffer {
    attribute double appendWindowStart;
    attribute unrestricted double appendWindowEnd;
  };

- appendWindowStart is initially set to 0.
- appendWindowEnd is initially set to positive Infinity.
- Setting appendWindowStart throws an exception if one tries to set it to a value >= appendWindowEnd.
- Setting appendWindowEnd throws an exception if one tries to set it to a value <= appendWindowStart.
- The attributes can only be modified when updating == false, just like timestampOffset.
- The coded frame processing algorithm drops coded frames with presentationTimestamp < appendWindowStart.
- The coded frame processing algorithm drops coded frames with presentationTimestamp >= appendWindowEnd.
- If a coded frame is dropped before appendWindowStart, then a "needs RAP" flag is set so that the coded frame processing algorithm will continue to drop coded frames until it receives a RAP (random access point) with a presentation timestamp >= appendWindowStart.

*Questions:*

- Should abort() reset appendWindowStart & appendWindowEnd to 0 & positive Infinity respectively?
- Any suggestions on better names for these attributes?

I believe this proposal would address most of your concerns. It will not support seamlessly splicing at a non-RAP boundary, but I'd like to defer that until v2 if possible, since it would require having 2 decoder instances and/or faster-than-real-time decoding. I'd like to nail down the simpler RAP-based splices and get interop before diving into the non-RAP case.

If folks are ok with this, then I'd say file a bug and I'll start working on adding this to the spec.
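To make the filtering rules in the proposal concrete, here is a minimal TypeScript sketch of how the append window might behave. It is only an illustration under stated assumptions: the names AppendWindowFilter, CodedFrame, isRandomAccessPoint and accept() are hypothetical, the exception types are guesses (the proposal only says "throws an exception"), and the real logic would live inside the coded frame processing algorithm rather than in a standalone class.

```typescript
// Hypothetical coded frame shape; field names are illustrative assumptions.
interface CodedFrame {
  presentationTimestamp: number; // seconds
  isRandomAccessPoint: boolean;  // true for a RAP (e.g. a keyframe)
}

// Sketch of the proposed append window. It models only the filtering
// rules listed in the proposal, not the full coded frame processing.
class AppendWindowFilter {
  private windowStart = 0;                      // initial value per the proposal
  private windowEnd = Number.POSITIVE_INFINITY; // initial value per the proposal
  private needsRAP = false;

  // Stand-in for SourceBuffer.updating.
  updating = false;

  get appendWindowStart(): number {
    return this.windowStart;
  }

  set appendWindowStart(value: number) {
    this.assertNotUpdating();
    // A WebIDL "double" (not "unrestricted") rejects NaN and +/-Infinity.
    if (!Number.isFinite(value)) {
      throw new TypeError("appendWindowStart must be finite");
    }
    // Exception type is an assumption; the proposal only says "throws".
    if (value >= this.windowEnd) {
      throw new RangeError("appendWindowStart must be < appendWindowEnd");
    }
    this.windowStart = value;
  }

  get appendWindowEnd(): number {
    return this.windowEnd;
  }

  set appendWindowEnd(value: number) {
    this.assertNotUpdating();
    if (value <= this.windowStart) {
      throw new RangeError("appendWindowEnd must be > appendWindowStart");
    }
    this.windowEnd = value;
  }

  // Returns true if the frame is kept, false if it is dropped.
  accept(frame: CodedFrame): boolean {
    const ts = frame.presentationTimestamp;
    if (ts < this.windowStart) {
      // Dropped before the window: later frames may depend on this one,
      // so keep dropping until a RAP arrives inside the window.
      this.needsRAP = true;
      return false;
    }
    if (ts >= this.windowEnd) {
      return false;
    }
    if (this.needsRAP) {
      if (!frame.isRandomAccessPoint) {
        return false;
      }
      this.needsRAP = false;
    }
    return true;
  }

  private assertNotUpdating(): void {
    if (this.updating) {
      throw new Error("append window cannot be modified while updating");
    }
  }
}
```

For the ad insertion example in the quoted message, one plausible mapping (an assumption, not something the proposal spells out) would be to set appendWindowEnd to 14 before appending A2, then widen the window to [14, 42) with a timestampOffset of 14 for B1..B3, and finally move appendWindowStart to 42 before appending A4. With this sketch, A4 frames would then be dropped until the first RAP at or after 42s, which is the non-RAP limitation mentioned above.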
Aaron

On Thu, Feb 21, 2013 at 11:48 AM, Michael Thornburgh <mthornbu@adobe.com> wrote:

> the current buffering/splicing/overlap model for media segments implies
> that the intended granularity for the "ad insertion" and "video editing"
> goals (section 1.1) is "whole segments". the overlap & splicing behavior
> seems to be designed primarily for the adaptive streaming case, not
> necessarily for ad insertion and definitely not for the general "video
> editing" case (of which ad insertion is a subset).
>
> consider programs A (the "main program") and B (the "ad"), with A being
> live. the stream encoder/segmenter will typically be free-running, making
> random access points and segment boundaries in natural places independent
> of any external cue inputs. an operator may at some point push the "ad
> goes here" button, which should only have to create a cue marker in the
> manifest file. it may be impractical or infeasible to affect the operation
> of the encoder/segmenter to create a segment boundary at the ad-start or
> ad-end-and-main-program-resumes points.
>
>
> 0s               14s              31s        42s
>                  +-- cue B                   +-- cue A
> prog A           v                           v
> |-----------|----:vvvvvv|. . . . .|vvvvvvvvvv:---|-----------|-----------|
>     A1(1)    A2  :         A3(-)     A4(4)   :       A5(7)       A6(8)
>              (2) :B1(3)       B2(5)    B3(6) :
>                  |---------|---------|-------|
>                  prog B
>                  0s                          28s
>
>   1. append A1;
>   2. append A2;
>   3. append B1 at +14s in;
>   4. append A4;
>   5. append B2 at +14s in;
>   6. append B3 at +14s in;
>   7. append A5;
>   8. append A6...
>
>
> in this example, main program segment A4 is overlapped by ad segments B2
> and B3. this can be accommodated with the current buffering/overlap model,
> but in a fairly unnatural way. to achieve the desired rendering, the
> append order must be [A1, A2, B1, A4, B2, B3, A5, A6, ...] -- in other
> words, not in the natural playback order. every application will need to
> implement a segment overlap scheduler to get this ordering right.
> note also that there is a race with the playback position vs the appends,
> where if you're running close to the playback position, you might display a
> portion of the wrong program (for example, missing the beginning of an ad
> or temporarily switching back to the main program in the middle of the ad).
>
> this works for the ad insertion case because the advertiser will typically
> want their entire ad played from beginning to end. for the general "video
> editing" case, there's no way to come in to program B at
> not-a-segment-boundary from program A not-a-segment-boundary, using the
> current model.
>
> some months ago i did some experiments/proofs-of-concept with seamless ad
> insertion at non-segment/non-keyframe boundaries in Flash Player (built on
> top of the "appendBytes" APIs). i had 4 simple primitives that gave
> general editing capabilities in the natural segment playback order, with no
> races (if data was late, playback would stall rather than playing the wrong
> thing):
>
>   1) append segment data;
>   2) discontinuity;
>   3) stop appending from segment at time Te (until discontinuity);
>   4) after discontinuity, start playback from new segment at time Tb (not
>      necessarily at a keyframe, like a seek).
>
> for the ad insertion example above, this looks like:
>
>
> 0s               14s              0s         11s
>                  +-- cue B                   +-- cue A
> prog A           v                           v
> |-----------|----:XXXXXX|. . . . .|>>>>>>>>>>:---|-----------|-----------|
>     A1(1)    A2  :         A3(-)     A4(6)   :       A5(7)       A6(8)
>              (2) :B1(3)       B2(4)    B3(5) :
>                  |---------|---------|-------|
>                  prog B
>                  0s                          28s
>
>   1. append A1;
>   2.
>     2a. stop at 14s in (Te=14s);
>     2b. append A2;
>   3.
>     3a. discontinuity;
>     3b. start next segment 0s in (Tb=0s relative)
>     3c. append B1 at discontinuity;
>   4. append B2;
>   5. append B3;
>   6.
>     6a. discontinuity;
>     6b. start next segment 11s in (Tb=11s relative);
>     6c. append A4 (skipping ahead to 11s in) at discontinuity;
>   7. append A5;
>   8. append A6...
>
> note that this model could also support starting in on B at
> not-the-beginning and ending at not-the-end, if that was desired.
>
> if it's the intention that ad insertion (and editing in general) should
> always be at segment boundaries, then the complications i described above
> go away and you can just append in the natural playback order. however, i
> believe real-world use scenarios (especially ad insertion into live
> streams) will require seamless splicing at not-segment-boundaries,
> requiring implementation of the complicated scheduling and non-natural
> append order described above, as well as exposure to possible races. i
> believe it would be advantageous to support this use case in a more natural
> way.
>
> -michael thornburgh
Received on Monday, 25 February 2013 16:17:17 UTC