- From: Cyril Concolato <cyril.concolato@telecom-paristech.fr>
- Date: Tue, 26 Feb 2013 09:57:45 +0100
- To: public-html-media@w3.org
- Message-ID: <512C7909.9050908@telecom-paristech.fr>
On 25/02/2013 21:01, Michael Thornburgh wrote:
>
> hi Aaron.
>
> i think this could work, along with bug 20901 to handle the "API-level
> discontinuity indicator". in the case of such a discontinuity, you'd
> want the timeline splice and renormalization to happen at the
> appendWindowStart.
>
Similarly, you shouldn't have to know the duration of your ad (to avoid
precision problems); the switch back to the live stream should be
handled by the media engine.
Cyril
>
> regarding the RAP behavior: the idea would be that a mechanism such as
> this could let you lay out data in the source buffer just like you
> could with doing overlaps, only appending in the natural playback
> order instead of having to do weird out-of-order layering. as such,
> the non-RAP behavior should be the same as currently defined with the
> overlap model, with a similar note about "significantly increasing
> implementation complexity and delays at the splice point" as is in
> 2.1.3 End Overlap.
>
> i think abort() should also reset the appendWindow properties. the
> names seem fine to me.
>
> i will open a bug.
>
> -mike
>
> *From:*Aaron Colwell [mailto:acolwell@google.com]
> *Sent:* Monday, February 25, 2013 8:17 AM
> *To:* Michael Thornburgh
> *Cc:* public-html-media@w3.org
> *Subject:* Re: [MSE] buffering/splicing/overlap model, ad insertion
> and video editing goals
>
> Hi Michael,
>
> I think I see what you are getting at here. I believe this
> functionality would essentially sit between steps 7 & 8 of the coded
> frame algorithm
> <https://dvcs.w3.org/hg/html-media/raw-file/default/media-source/media-source.html#sourcebuffer-coded-frame-processing> and
> would act as a "coded frame filter" or "append window". How about
> this for an initial proposal?
>
> *Proposal:*
>
> partial interface SourceBuffer {
>
> attribute double appendWindowStart;
>
> attribute unrestricted double appendWindowEnd;
>
> }
>
> - appendWindowStart is initially set to 0;
>
> - appendWindowEnd is initially set to positive Infinity.
>
> - Setting appendWindowStart throws an exception if one tries to set it
> to a value >= appendWindowEnd.
>
> - Setting appendWindowEnd throws an exception if one tries to set it
> to a value <= appendWindowStart.
>
> - The attributes can only be modified when updating == false, just
> like timestampOffset.
>
> - The coded frame processing algorithm drops coded frames w/
> presentationTimestamp < appendWindowStart
>
> - The coded frame processing algorithm drops coded frames w/
> presentationTimestamp >= appendWindowEnd.
>
> - If a coded frame is dropped before appendWindowStart, then a "needs
> RAP" flag is set so that the coded frame processing algorithm will
> continue to drop coded frames until it receives a RAP with a
> presentation timestamp >= appendWindowStart.
>
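The filtering rules in the proposal above can be sketched as a small model. This is only an illustration under stated assumptions, not the spec algorithm: the names `CodedFrame`, `AppendWindowFilter`, and `accept` are hypothetical, timestamps are in seconds, and the "only modifiable when updating == false" rule is left out.

```python
import math
from dataclasses import dataclass


@dataclass
class CodedFrame:
    pts: float     # presentation timestamp, in seconds
    is_rap: bool   # True if the frame is a random access point


class AppendWindowFilter:
    """Toy model of the proposed append window (hypothetical names)."""

    def __init__(self) -> None:
        self.start = 0.0         # appendWindowStart, initially 0
        self.end = math.inf      # appendWindowEnd, initially +Infinity
        self.needs_rap = False   # the "needs RAP" flag from the proposal

    def set_start(self, value: float) -> None:
        if value >= self.end:
            raise ValueError("appendWindowStart must be < appendWindowEnd")
        self.start = value

    def set_end(self, value: float) -> None:
        if value <= self.start:
            raise ValueError("appendWindowEnd must be > appendWindowStart")
        self.end = value

    def accept(self, frame: CodedFrame) -> bool:
        """Return True if the frame is kept, False if it is dropped."""
        if frame.pts < self.start:
            self.needs_rap = True    # keep dropping until the next RAP
            return False
        if frame.pts >= self.end:
            return False
        if self.needs_rap:
            if not frame.is_rap:
                return False
            self.needs_rap = False
        return True


window = AppendWindowFilter()
window.set_start(14.0)
window.set_end(20.0)
frames = [
    CodedFrame(12.0, True), CodedFrame(13.0, False), CodedFrame(14.0, False),
    CodedFrame(15.0, True), CodedFrame(16.0, False), CodedFrame(20.0, True),
]
kept = [f.pts for f in frames if window.accept(f)]
```

In this made-up frame sequence, the frame at 14s lies inside the window but is still dropped because it is not a RAP; the first frame kept is the RAP at 15s.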
> *Questions:*
>
> - Should abort() reset appendWindowStart & appendWindowEnd to 0 &
> positive Infinity respectively?
>
> - Any suggestions on better names for these attributes?
>
> I believe this proposal would address most of your concerns. It will
> not support seamlessly splicing at a non-RAP boundary, but I'd like to
> defer that until v2 if possible since it would require having 2
> decoder instances and/or require faster than real-time decoding. I'd
> like to nail down the simpler RAP based splices and get interop before
> diving into the non-RAP case.
>
> If folks are ok with this, then I'd say file a bug and I'll start
> working on adding this to the spec.
>
> Aaron
>
> On Thu, Feb 21, 2013 at 11:48 AM, Michael Thornburgh
> <mthornbu@adobe.com> wrote:
>
>
> the current buffering/splicing/overlap model for media segments
> implies that the intended granularity for the "ad insertion" and
> "video editing" goals (section 1.1) is "whole segments". the overlap
> & splicing behavior seems to be designed primarily for the adaptive
> streaming case, not necessarily for ad insertion and definitely not
> for the general "video editing" case (of which ad insertion is a subset).
>
> consider programs A (the "main program") and B (the "ad"), with A
> being live. the stream encoder/segmenter will typically be
> free-running, making random access points and segment boundaries in
> natural places independent of any external cue inputs. an operator
> may at some point push the "ad goes here" button, which should only
> have to create a cue marker in the manifest file. it may be
> impractical or infeasible to affect the operation of the
> encoder/segmenter to create a segment boundary at the ad-start or
> ad-end-and-main-program-resumes points.
>
>
> 0s               14s              31s        42s
>                  +-- cue B                   +-- cue A
> prog A           v                           v
> |-----------|----:vvvvvv|. . . . .|vvvvvvvvvv:---|-----------|-----------|
>     A1(1)     A2 :         A3(-)     A4(4)   :       A5(7)       A6(8)
>              (2) :B1(3)      B2(5)    B3(6)  :
>                  |---------|---------|-------|
>                             prog B
>                  0s                        28s
>
> 1. append A1;
> 2. append A2;
> 3. append B1 at +14s in;
> 4. append A4;
> 5. append B2 at +14s in;
> 6. append B3 at +14s in;
> 7. append A5;
> 8. append A6...
>
>
> in this example, main program segment A4 is overlapped by ad segments
> B2 and B3. this can be accommodated with the current
> buffering/overlap model, but in a fairly unnatural way. to achieve
> the desired rendering, the append order must be [A1, A2, B1, A4, B2,
> B3, A5, A6, ...] -- in other words, not in the natural playback order.
> every application will need to implement a segment overlap scheduler
> to get this ordering right. note also that there is a race with the
> playback position vs the appends, where if you're running close to the
> playback position, you might display a portion of the wrong program
> (for example, missing the beginning of an ad or temporarily switching
> back to the main program in the middle of the ad).
>
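The "segment overlap scheduler" described above can be sketched roughly as follows. This is one possible approach under stated assumptions, not a definitive implementation: `Segment` and `overlap_append_order` are hypothetical names, later appends are assumed to replace the overlapped part of earlier ones, and every segment boundary other than the 14s cue, the 28s ad length, and A4 starting at 31s is invented for illustration.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    name: str
    start: float   # presentation start, in seconds
    end: float     # presentation end, in seconds


def overlap_append_order(main: List[Segment], ad: List[Segment],
                         ad_offset: float) -> List[str]:
    """Return an append order in which ad segments overlay main segments.

    Assumes `main` and `ad` are each sorted by start time and `ad` is
    non-empty. `ad_offset` places the ad on the main timeline.
    """
    placed = [Segment(s.name, s.start + ad_offset, s.end + ad_offset)
              for s in ad]
    ad_start, ad_end = placed[0].start, placed[-1].end

    # Main segments entirely hidden by the ad never need to be appended.
    pending = [s for s in main
               if not (ad_start <= s.start and s.end <= ad_end)]

    order: List[str] = []
    for ad_seg in placed:
        # Any main segment that begins before this ad segment ends must
        # already be in the buffer, so that the ad is appended on top of it.
        while pending and pending[0].start < ad_seg.end:
            order.append(pending.pop(0).name)
        order.append(ad_seg.name)
    order.extend(s.name for s in pending)
    return order


main_program = [Segment("A1", 0, 10), Segment("A2", 10, 21),
                Segment("A3", 21, 31), Segment("A4", 31, 45),
                Segment("A5", 45, 55), Segment("A6", 55, 65)]
ad_program = [Segment("B1", 0, 10), Segment("B2", 10, 20),
              Segment("B3", 20, 28)]
order = overlap_append_order(main_program, ad_program, ad_offset=14)
```

With these assumed boundaries the sketch reproduces the order [A1, A2, B1, A4, B2, B3, A5, A6] from the example. It does nothing about the race against the playback position.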
> this works for the ad insertion case because the advertiser will
> typically want their entire ad played from beginning to end. for the
> general "video editing" case, there's no way to come in to program B
> at not-a-segment-boundary from program A not-a-segment-boundary, using
> the current model.
>
> some months ago i did some experiments/proofs-of-concept with seamless
> ad insertion at non-segment/non-keyframe boundaries in Flash Player
> (built on top of the "appendBytes" APIs). i had 4 simple primitives
> that gave general editing capabilities in the natural segment playback
> order, with no races (if data was late, playback would stall rather
> than playing the wrong thing):
>
> 1) append segment data;
> 2) discontinuity;
> 3) stop appending from segment at time Te (until discontinuity);
> 4) after discontinuity, start playback from new segment at time Tb
> (not necessarily at a keyframe, like a seek).
>
> for the ad insertion example above, this looks like:
>
>
> 0s               14s              0s         11s
>                  +-- cue B                   +-- cue A
> prog A           v                           v
> |-----------|----:XXXXXX|. . . . .|>>>>>>>>>>:---|-----------|-----------|
>     A1(1)     A2 :         A3(-)     A4(6)   :       A5(7)       A6(8)
>              (2) :B1(3)      B2(4)    B3(5)  :
>                  |---------|---------|-------|
>                             prog B
>                  0s                        28s
>
> 1. append A1;
> 2.
> 2a. stop at 14s in (Te=14s);
> 2b. append A2;
> 3.
> 3a. discontinuity;
> 3b. start next segment 0s in (Tb=0s relative)
> 3c. append B1 at discontinuity;
> 4. append B2;
> 5. append B3;
> 6.
> 6a. discontinuity;
> 6b. start next segment 11s in (Tb=11s relative);
> 6c. append A4 (skipping ahead to 11s in) at discontinuity;
> 7. append A5;
> 8. append A6...
>
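The numbered steps above can be traced with a toy model of the four primitives. This is a sketch under stated assumptions, not the Flash Player implementation: `SpliceModel` and its method names are hypothetical, Te is taken on the source program's own timeline, Tb is taken relative to the start of the next appended segment, and the segment boundaries are invented apart from the 14s cue, the 28s ad, and A4 starting at 31s.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SpliceModel:
    """Toy model of the four primitives (hypothetical names)."""
    out_time: float = 0.0               # next free position on the output timeline
    stop_at: Optional[float] = None     # Te, on the source program's timeline
    start_in: Optional[float] = None    # Tb, relative to the next segment's start
    # Each entry: (segment name, source from, source to, output start)
    rendered: List[Tuple[str, float, float, float]] = field(default_factory=list)

    def discontinuity(self) -> None:
        self.stop_at = None             # a pending Te only applies until here
        self.start_in = None

    def stop_at_time(self, te: float) -> None:
        self.stop_at = te

    def start_next_segment_at(self, tb: float) -> None:
        self.start_in = tb

    def append(self, name: str, start: float, end: float) -> None:
        src_from = start
        if self.start_in is not None:
            src_from = start + self.start_in    # skip ahead Tb into the segment
            self.start_in = None
        src_to = end if self.stop_at is None else min(end, self.stop_at)
        if src_to <= src_from:
            return                              # nothing of this segment survives
        self.rendered.append((name, src_from, src_to, self.out_time))
        self.out_time += src_to - src_from


m = SpliceModel()
m.append("A1", 0, 10)                                   # step 1
m.stop_at_time(14)                                      # step 2a
m.append("A2", 10, 21)                                  # step 2b
m.discontinuity()                                       # step 3a
m.start_next_segment_at(0)                              # step 3b
m.append("B1", 0, 10)                                   # step 3c
m.append("B2", 10, 20)                                  # step 4
m.append("B3", 20, 28)                                  # step 5
m.discontinuity()                                       # step 6a
m.start_next_segment_at(11)                             # step 6b
m.append("A4", 31, 45)                                  # step 6c
m.append("A5", 45, 55)                                  # step 7
m.append("A6", 55, 65)                                  # step 8
```

In this trace every append is issued in playback order, A3 is never appended, the ad occupies output time 14s to 42s, and A4 resumes at output time 42s, which matches its own 42s position (31s + 11s).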
> note that this model could also support starting in on B at
> not-the-beginning and ending at not-the-end, if that was desired.
>
> if it's the intention that ad insertion (and editing in general)
> should always be at segment boundaries, then the complications i
> described above go away and you can just append in the natural
> playback order. however, i believe real-world use scenarios
> (especially ad insertion into live streams) will require seamless
> splicing at not-segment-boundaries, requiring implementation of the
> complicated scheduling and non-natural append order described above,
> as well as exposure to possible races. i believe it would be
> advantageous to support this use case in a more natural way.
>
> -michael thornburgh
>
>
--
Cyril Concolato
Maître de Conférences/Associate Professor
Groupe Multimedia/Multimedia Group
Telecom ParisTech
46 rue Barrault
75 013 Paris, France
http://concolato.wp.mines-telecom.fr/
Received on Tuesday, 26 February 2013 08:58:11 UTC