- From: Mark Watson <watsonm@netflix.com>
- Date: Mon, 18 Jun 2012 18:41:41 +0000
- To: Steven Robertson <strobe@google.com>
- CC: Duncan Rowden <Duncan.Rowden@bbc.co.uk>, Aaron Colwell <acolwell@google.com>, "<public-html-media@w3.org>" <public-html-media@w3.org>
All, I think there are a couple of misunderstandings in the thread below.

Firstly, this offset mechanism is not intended as a general solution to ad insertion. Certain simple kinds of ad insertion can be implemented this way (specifically, the kind where the ads occupy specific intervals on a simple global linear presentation timeline and the only problem is mapping the ad-internal media timestamps to that timeline). For more complex ad insertion scenarios, I think we need to assume these will be taken care of by the application (absent some more worked-out use-cases and proposals for browser features to support them).

Secondly, I don't see any practical difference between the two proposals. There is no more or less need for "mapping tables" at the application in either case. It's an important feature of the current draft that media segments carry their own timing: no assumptions are made about the timing of a media segment on the basis of the previous media segment. For example, suppose I have media segments A with internal media timestamp range [0s, 2s), B with [2s, 4s) and C with [20s, 22s). If I append A then B, the result is a source buffer with media from 0s to 4s. If I append C then B, the result is a source buffer with media from 2s to 4s and from 20s to 22s. B ends up in the same place whatever went before.

So, to correctly make use of either proposed function for insertion of content with a different internal timeline, the application needs knowledge of the current position in the timeline (i.e. the media-internal timestamps of the already-provided data) and the media-internal timestamps of the to-be-inserted data.

Another example. Suppose I have segments with media timestamps filling the range D = [0s, 60s), more filling the range E = [60s, 120s), and an advert whose internal timestamps fill the range F = [15s, 30s).
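The append behaviour described above (B landing at [2s, 4s) regardless of what was appended before it) can be sketched as a toy model in plain JavaScript. This is an illustration only: `append` is a hypothetical helper modelling ranges as `[start, end)` pairs, not the real `SourceBuffer` API.

```javascript
// Toy model of a source buffer: each appended segment carries its own
// internal timestamps, so where it lands never depends on append order.
// (Hypothetical helper for illustration, not the MSE SourceBuffer API.)
function append(buffered, segment) {
  // Collect existing ranges plus the new segment, sorted by start time.
  const ranges = [...buffered, segment].sort((a, b) => a[0] - b[0]);
  // Coalesce adjacent or overlapping ranges into a buffered-ranges list.
  const out = [];
  for (const [s, e] of ranges) {
    const last = out[out.length - 1];
    if (last && s <= last[1]) last[1] = Math.max(last[1], e);
    else out.push([s, e]);
  }
  return out;
}

const A = [0, 2], B = [2, 4], C = [20, 22];
console.log(append(append([], A), B)); // A then B -> [ [ 0, 4 ] ]
console.log(append(append([], C), B)); // C then B -> [ [ 2, 4 ], [ 20, 22 ] ]
```

In both runs B occupies [2s, 4s); only the segment's own timestamps decide its position.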
If I want to play D then F then E: using mapping, I feed D, then a mapping (15s media -> 60s presentation), then F, then another mapping (60s media -> 75s presentation), then E. Using offsets, I feed D, then offset 45s, then F, then offset 15s, then E. I need to know the internal timestamps of both the segments and the advert, whether I use the "mapping" or the "offset" function.

Based on the above, I have no strong preference. The mapping option is perhaps better because it makes more explicit that the script needs the knowledge described above.

…Mark

On Jun 18, 2012, at 10:01 AM, Steven Robertson wrote:

> On Mon, Jun 18, 2012 at 4:25 AM, Duncan Rowden <Duncan.Rowden@bbc.co.uk> wrote:
>> - Ideally, when inserting an advert, this won’t contaminate the
>> main media item’s timeline. The reasoning for this is that when ads are
>> inserted, a content provider may only wish these to be viewed once. So when
>> rewinding back through a given point either another ad can be inserted or
>> playback can occur without showing an ad.
>
> My understanding was that duration calculation from an initialization
> segment was not possible, at least in BMFF, given that neither edit
> lists nor the 'mehd' box are required, and anyway hitherto-unimagined
> uses of the Media Source API would be difficult to account for with
> any standard notion of a single timeline. As a result, it seems like
> the intention is to leave presentation of a scrubber and transport
> controls to the application for complex cases. It's easy to verify the
> behavior of a scrubber across multiple platforms (just look at it and
> maybe interact with it); it seems much harder to deal with subtle
> implementation-specific details of a source buffer switching mechanism
> across multiple platforms, since one can only interact with it
> temporally and indirectly.
>
> Although using source-switching to composite seamlessly to a single
> timeline below the source-buffer level is conceptually similar to
> compositing to a single timeline at an application-controlled level, I
> feel that source-switching would be the origin of a lot of platform
> quirks that would be challenging to understand and correct for. The
> source buffer logic is already both very author-friendly and quite
> non-trivial to implement correctly; adding another invisible layer
> beneath that would be more complicated still. I believe that making
> things easier on implementors by giving them a single contiguous
> stream of buffers with consistent timestamps – which is basically what
> they get anyway from a traditional demuxer – will result in much more
> consistent behavior across platforms, which is better for authors and
> users.
>
>> - The point at which ads need to be inserted may not necessarily
>> coincide with a segment boundary, which as I understand it would be required
>> using this proposal.
>
> Interstitials must always start _with_ a segment boundary, but they
> don't necessarily need to be inserted starting _on_ a segment boundary
> in the original timeline. A UA can call sourceAbort() and start
> appending the interstitial, or can simply append a complete segment of
> the original content and then append the interstitial media.
>
> [That said, the quality of experience on some platforms is likely to
> be significantly degraded (i.e. system crash) if interstitials start
> anywhere but a segment boundary. We're working around it, but these
> kinds of issues are likely to be common, and unlikely to be fixable
> without some kind of user-obvious deviation from the spec like long
> pipeline flushes.]
>
> Steve
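For reference, the arithmetic in Mark's D/F/E example above works out as follows. This is an illustrative sketch in plain JavaScript: `offsetFor` is a hypothetical helper, not part of either proposal or of the Media Source API; it just shows that an "offset" is the difference between a target presentation time and an internal media timestamp, which is why the two proposals carry the same information.

```javascript
// A "mapping" pins an internal media timestamp to a presentation time;
// the equivalent "offset" is simply their difference.
// (Illustrative helper, not part of the MSE API or either proposal.)
const offsetFor = (mediaTime, presentationTime) => presentationTime - mediaTime;

// D fills [0s, 60s), E fills [60s, 120s); the advert F internally
// fills [15s, 30s). Desired presentation: D, then F at 60s, then E at 75s.
const fOffset = offsetFor(15, 60); // mapping 15s media -> 60s presentation
const eOffset = offsetFor(60, 75); // mapping 60s media -> 75s presentation

console.log(fOffset, eOffset); // 45 15
// Either way, the script must know both the current presentation
// position and the internal timestamps of the data it is appending.
```

With the offsets applied, F's [15s, 30s) lands at [60s, 75s) and E's [60s, 120s) lands at [75s, 135s) — exactly the "offset 45s … offset 15s" sequence in the message above.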
Received on Monday, 18 June 2012 18:42:12 UTC