- From: Cyril Concolato <cyril.concolato@telecom-paristech.fr>
- Date: Tue, 26 Feb 2013 09:57:45 +0100
- To: public-html-media@w3.org
- Message-ID: <512C7909.9050908@telecom-paristech.fr>
On 25/02/2013 21:01, Michael Thornburgh wrote:
>
> hi Aaron.
>
> i think this could work, along with bug 20901 to handle the "API-level
> discontinuity indicator". in the case of such a discontinuity, you'd
> want the timeline splice and renormalization to happen at the
> appendWindowStart.
>
Similarly, you shouldn't have to know the duration of your ad (to avoid
precision problems); the switch back to the live stream should be
handled by the media engine.

Cyril

>
> regarding the RAP behavior: the idea would be that a mechanism such as
> this could let you lay out data in the source buffer just like you
> could with doing overlaps, only appending in the natural playback
> order instead of having to do weird out-of-order layering. as such,
> the non-RAP behavior should be the same as currently defined with the
> overlap model, with a similar note about "significantly increasing
> implementation complexity and delays at the splice point" as is in
> 2.1.3 End Overlap.
>
> i think abort() should also reset the appendWindow properties. the
> names seem fine to me.
>
> i will open a bug.
>
> -mike
>
> *From:* Aaron Colwell [mailto:acolwell@google.com]
> *Sent:* Monday, February 25, 2013 8:17 AM
> *To:* Michael Thornburgh
> *Cc:* public-html-media@w3.org
> *Subject:* Re: [MSE] buffering/splicing/overlap model, ad insertion
> and video editing goals
>
> Hi Michael,
>
> I think I see what you are getting at here. I believe this
> functionality would essentially sit between steps 7 & 8 of the coded
> frame algorithm
> <https://dvcs.w3.org/hg/html-media/raw-file/default/media-source/media-source.html#sourcebuffer-coded-frame-processing> and
> would act as a "coded frame filter" or "append window". How about
> this for an initial proposal.
>
> *Proposal:*
>
> partial interface SourceBuffer {
>   attribute double appendWindowStart;
>   attribute unrestricted double appendWindowEnd;
> }
>
> - appendWindowStart is initially set to 0.
>
> - appendWindowEnd is initially set to positive Infinity.
>
> - Setting appendWindowStart throws an exception if one tries to set it
> to a value >= appendWindowEnd.
>
> - Setting appendWindowEnd throws an exception if one tries to set it
> to a value <= appendWindowStart.
>
> - The attributes can only be modified when updating == false, just
> like timestampOffset.
>
> - The coded frame processing algorithm drops coded frames w/
> presentationTimestamp < appendWindowStart.
>
> - The coded frame processing algorithm drops coded frames w/
> presentationTimestamp >= appendWindowEnd.
>
> - If a coded frame is dropped before appendWindowStart, then a "needs
> RAP" flag is set so that the coded frame processing algorithm will
> continue to drop coded frames until it receives a RAP with a
> presentation timestamp >= appendWindowStart.
>
> *Questions:*
>
> - Should abort() reset appendWindowStart & appendWindowEnd to 0 &
> positive Infinity respectively?
>
> - Any suggestions on better names for these attributes?
>
> I believe this proposal would address most of your concerns. It will
> not support seamlessly splicing at a non-RAP boundary, but I'd like to
> defer that until v2 if possible since it would require having 2
> decoder instances and/or require faster than real-time decoding. I'd
> like to nail down the simpler RAP based splices and get interop before
> diving into the non-RAP case.
>
> If folks are ok with this, then I'd say file a bug and I'll start
> working on adding this to the spec.
>
> Aaron
>
> On Thu, Feb 21, 2013 at 11:48 AM, Michael Thornburgh
> <mthornbu@adobe.com> wrote:
>
> the current buffering/splicing/overlap model for media segments
> implies that the intended granularity for the "ad insertion" and
> "video editing" goals (section 1.1) is "whole segments". the overlap
> & splicing behavior seems to be designed primarily for the adaptive
> streaming case, not necessarily for ad insertion and definitely not
> for the general "video editing" case (of which ad insertion is a subset).
>
> consider programs A (the "main program") and B (the "ad"), with A
> being live. the stream encoder/segmenter will typically be
> free-running, making random access points and segment boundaries in
> natural places independent of any external cue inputs. an operator
> may at some point push the "ad goes here" button, which should only
> have to create a cue marker in the manifest file. it may be
> impractical or infeasible to affect the operation of the
> encoder/segmenter to create a segment boundary at the ad-start or
> ad-end-and-main-program-resumes points.
>
> 0s              14s              31s        42s
>                  +-- cue B                   +-- cue A
> prog A           v                           v
> |-----------|----:vvvvvv|. . . . .|vvvvvvvvvv:---|-----------|-----------|
>     A1(1)     A2 :         A3(-)     A4(4)   :       A5(7)       A6(8)
>              (2) :B1(3)       B2(5)    B3(6) :
>                  |---------|---------|-------|
>                  prog B
>                  0s                         28s
>
> 1. append A1;
> 2. append A2;
> 3. append B1 at +14s in;
> 4. append A4;
> 5. append B2 at +14s in;
> 6. append B3 at +14s in;
> 7. append A5;
> 8. append A6...
>
> in this example, main program segment A4 is overlapped by ad segments
> B2 and B3. this can be accommodated with the current
> buffering/overlap model, but in a fairly unnatural way. to achieve
> the desired rendering, the append order must be [A1, A2, B1, A4, B2,
> B3, A5, A6, ...] -- in other words, not in the natural playback order.
> every application will need to implement a segment overlap scheduler
> to get this ordering right.
> note also that there is a race with the
> playback position vs the appends, where if you're running close to the
> playback position, you might display a portion of the wrong program
> (for example, missing the beginning of an ad or temporarily switching
> back to the main program in the middle of the ad).
>
> this works for the ad insertion case because the advertiser will
> typically want their entire ad played from beginning to end. for the
> general "video editing" case, there's no way to come in to program B
> at not-a-segment-boundary from program A not-a-segment-boundary, using
> the current model.
>
> some months ago i did some experiments/proofs-of-concept with seamless
> ad insertion at non-segment/non-keyframe boundaries in Flash Player
> (built on top of the "appendBytes" APIs). i had 4 simple primitives
> that gave general editing capabilities in the natural segment playback
> order, with no races (if data was late, playback would stall rather
> than playing the wrong thing):
>
> 1) append segment data;
> 2) discontinuity;
> 3) stop appending from segment at time Te (until discontinuity);
> 4) after discontinuity, start playback from new segment at time Tb
>    (not necessarily at a keyframe, like a seek).
>
> for the ad insertion example above, this looks like:
>
> 0s              14s               0s        11s
>                  +-- cue B                   +-- cue A
> prog A           v                           v
> |-----------|----:XXXXXX|. . . . .|>>>>>>>>>>:---|-----------|-----------|
>     A1(1)     A2 :         A3(-)     A4(6)   :       A5(7)       A6(8)
>              (2) :B1(3)       B2(4)    B3(5) :
>                  |---------|---------|-------|
>                  prog B
>                  0s                         28s
>
> 1. append A1;
> 2.
>    2a. stop at 14s in (Te=14s);
>    2b. append A2;
> 3.
>    3a. discontinuity;
>    3b. start next segment 0s in (Tb=0s relative);
>    3c. append B1 at discontinuity;
> 4. append B2;
> 5. append B3;
> 6.
>    6a. discontinuity;
>    6b. start next segment 11s in (Tb=11s relative);
>    6c. append A4 (skipping ahead to 11s in) at discontinuity;
> 7. append A5;
> 8. append A6...
>
> note that this model could also support starting in on B at
> not-the-beginning and ending at not-the-end, if that was desired.
>
> if it's the intention that ad insertion (and editing in general)
> should always be at segment boundaries, then the complications i
> described above go away and you can just append in the natural
> playback order. however, i believe real-world use scenarios
> (especially ad insertion into live streams) will require seamless
> splicing at not-segment-boundaries, requiring implementation of the
> complicated scheduling and non-natural append order described above,
> as well as exposure to possible races. i believe it would be
> advantageous to support this use case in a more natural way.
>
> -michael thornburgh
>

--
Cyril Concolato
Maître de Conférences/Associate Professor
Groupe Multimedia/Multimedia Group
Telecom ParisTech
46 rue Barrault
75 013 Paris, France
http://concolato.wp.mines-telecom.fr/
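
For readers following the thread, the "append window" filtering rules
in Aaron's proposal can be sketched roughly as below. This is only an
illustration, not spec text or any browser's implementation: the
CodedFrame shape, the AppendWindowFilter class, its accept() method and
the use of RangeError are all assumptions made for the example (the
proposal does not name the exception type), and the "updating == false"
check is left out.

```typescript
// Hypothetical frame shape (assumption); timestamps are in seconds,
// with any timestampOffset already applied.
interface CodedFrame {
  presentationTimestamp: number;
  isRandomAccessPoint: boolean;
}

// Illustrative stand-in for the proposed SourceBuffer attributes.
class AppendWindowFilter {
  private start = 0;
  private end = Number.POSITIVE_INFINITY;
  private needsRandomAccessPoint = false;

  get appendWindowStart(): number {
    return this.start;
  }
  set appendWindowStart(value: number) {
    // The proposal only says "throws an exception"; RangeError is an assumption.
    if (value >= this.end) {
      throw new RangeError("appendWindowStart must be < appendWindowEnd");
    }
    this.start = value;
  }

  get appendWindowEnd(): number {
    return this.end;
  }
  set appendWindowEnd(value: number) {
    if (value <= this.start) {
      throw new RangeError("appendWindowEnd must be > appendWindowStart");
    }
    this.end = value;
  }

  // Returns true if the frame is kept, false if it is dropped.
  accept(frame: CodedFrame): boolean {
    // Frames at or after the end of the window are dropped.
    if (frame.presentationTimestamp >= this.end) {
      return false;
    }
    // Frames before the window are dropped and set the "needs RAP" flag.
    if (frame.presentationTimestamp < this.start) {
      this.needsRandomAccessPoint = true;
      return false;
    }
    // Inside the window: keep dropping until a random access point arrives.
    if (this.needsRandomAccessPoint) {
      if (!frame.isRandomAccessPoint) {
        return false;
      }
      this.needsRandomAccessPoint = false;
    }
    return true;
  }
}

// Resuming the main program at cue A (42s) when the next RAP is at 44s.
const filter = new AppendWindowFilter();
filter.appendWindowStart = 42;
const frames: CodedFrame[] = [
  { presentationTimestamp: 40, isRandomAccessPoint: true },
  { presentationTimestamp: 41, isRandomAccessPoint: false },
  { presentationTimestamp: 42, isRandomAccessPoint: false },
  { presentationTimestamp: 43, isRandomAccessPoint: false },
  { presentationTimestamp: 44, isRandomAccessPoint: true },
  { presentationTimestamp: 45, isRandomAccessPoint: false },
];
const kept = frames
  .filter((f) => filter.accept(f))
  .map((f) => f.presentationTimestamp);
console.log(kept); // [ 44, 45 ]
```

In this run the frames at 42s and 43s fall inside the window but are
still dropped because no RAP has been seen yet, which matches the
remark in the proposal that a splice at a non-RAP boundary would not be
seamless.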
Received on Tuesday, 26 February 2013 08:58:11 UTC