- From: Aaron Colwell <acolwell@google.com>
- Date: Tue, 24 Jul 2012 16:42:02 -0700
- To: Kevin Streeter <kstreete@adobe.com>
- Cc: Mark Watson <watsonm@netflix.com>, "<public-html-media@w3.org>" <public-html-media@w3.org>
- Message-ID: <CAA0c1bAbxZ8Bgbdz3y7vX=4HrNfWnLsPwvc5SusLa0_Zy4v6Qg@mail.gmail.com>
Using reply all this time... :)

---------- Forwarded message ----------
From: Aaron Colwell <acolwell@google.com>
Date: Tue, Jul 24, 2012 at 4:40 PM
Subject: Re: [MSE] Establishing the Presentation Start Timestamp
To: Kevin Streeter <kstreete@adobe.com>

Hi Kevin,

Yes. The app can either set an offset or just append and then set
'videoTag.currentTime = videoTag.buffered.start(0)', which should contain
the start time of the first segment.

Aaron

On Tue, Jul 24, 2012 at 4:37 PM, Kevin Streeter <kstreete@adobe.com> wrote:

> Aaron, Mark,
>
> I’m a little unclear on how things will work for a live stream. The
> user will typically start playback at some non-zero time, which represents
> the “live” end of the stream. How does the timestamp offset account for
> this? Does it require setting a negative offset that re-bases the stream
> to 0 so that playback begins immediately?
>
> -K
>
> From: Mark Watson [mailto:watsonm@netflix.com]
> Sent: Tuesday, July 24, 2012 2:19 PM
> To: Aaron Colwell
> Cc: <public-html-media@w3.org>
> Subject: Re: [MSE] Establishing the Presentation Start Timestamp
>
> On Jul 24, 2012, at 2:13 PM, Aaron Colwell wrote:
>
> Hi Mark,
>
> Thanks for your comments. I too am starting to believe that we should just
> have the SourceBuffer timelines start at 0 and NOT derive the presentation
> start timestamp from the first segment appended. I agree that the timestamp
> offset mechanism should be used to handle any content that doesn't already
> start at 0. This might make things a little annoying for live streams, but
> it is a one-time operation at the beginning of playback to figure out what
> the appropriate timestamp offset needs to be. I think the 90-99% case will
> be with content that starts at 0 so we should optimize for that.
> If offsets need to be applied, the app needs to know about them or
> "discover" them by appending a segment to a "scratch" SourceBuffer and
> seeing what SourceBuffer.buffered reports.
>
> Right - and if we need to provide a "cleaner" way for the app to peek at
> the media timestamp we can do that later when the need is clearer.
>
> I do have a question about your "first append at 10 minutes" example. Are
> you saying that you want the HTMLMediaElement to implicitly seek to the
> start of the first media segment appended? I think it might be less
> surprising if the default playback start position
> <http://dev.w3.org/html5/spec/media-elements.html#default-playback-start-position>
> stays 0 and the app manually sets HTMLMediaElement.currentTime to
> HTMLMediaElement.buffered.start(0) or the desired seek time if you know it.
>
> Agreed - that's what I intended (even if that's not what I wrote ;-)
>
> Otherwise, I'm not sure how to make the behavior you desire fit into the
> descriptions specified in the offsets into the media resource
> <http://dev.w3.org/html5/spec/media-elements.html#offsets-into-the-media-resource>
> section of the HTML spec.
>
> Aaron
>
> On Wed, Jul 18, 2012 at 11:55 AM, Mark Watson <watsonm@netflix.com> wrote:
>
> On Jul 18, 2012, at 11:03 AM, Aaron Colwell wrote:
>
> Hi Mark,
>
> Comments inline...
>
> On Thu, Jul 12, 2012 at 2:32 PM, Mark Watson <watsonm@netflix.com> wrote:
>
> 4. How close do the starting timestamps on the first media segments from
> each SourceBuffer need to be?
> - In this example I've shown them to be only 30 milliseconds apart, but
> would 0.5 seconds be acceptable? Would 2 seconds?
> - How much time do we allow here before we consider there to be missing
> data and playback can't start?
> - What happens if the gap is too large?
>
> I think this is roughly the same question as 'what happens if I append a
> video segment which starts X ms after the end of the last video segment'?
>
> If X <= one frame interval, this is definitely not a 'gap' and playback
> continues smoothly. If X > 1 second this is definitely a gap and playback
> should stall (in the same way as it does today on a network outage).
>
> For X values in between, I am not sure: implementations have to draw a
> line somewhere. A gap of multiple frame intervals could occur when
> switching frame rate. You might also get a gap of a couple of frame
> intervals when switching if you do wacky things with frame reordering
> around segment boundaries.
>
> When looking at differences between audio and video, we need to be
> tolerant of differences as large as the larger of the audio frame size and
> the video frame interval.
>
> If the gap is too large, this element just stays in the same state.
> Perhaps I append video from 0s and audio from 2s and this is because my
> network requests got re-ordered and any millisecond now I am going to
> append the 0-2s audio. Playback should start when that 0-2s is appended.
>
> [acolwell] I agree. We need to come up with some spec text for this and
> then we can debate the merits of these various magic numbers. Care to
> volunteer for this? :)
>
> Ok, assign me a bug.
>
> Any insights or suggestions would be greatly appreciated.
>
> We have the same problem with push/popTimeOffset. Suppose I want your
> media above to appear at offset 200s in both audio and video source
> buffers. What I really want is for the audio to start at 200s and the video
> at 200.030s.
>
> In this case the application knows better than the media what the internal
> media times are.
> I know that the video segment has all the video from time
> 0s, even though the first frame is at 30ms. I really want to provide the
> actual offset to be applied to the internal timestamps, rather than
> providing the source buffer time that the next segment should start at.
>
> [acolwell] One way I think we could get around this is to mandate that the
> media segments actually have a start time of 0. In WebM there is a Cluster
> timestamp and then all blocks are relative to this timestamp. If the
> Cluster timestamp is 0 and the first frame in the cluster is at 30ms then
> there is enough information for the UA to "do the right thing". I'm not
> sure if a similar mechanism exists in ISO.
>
> Not really - the rather complex combination of decode times, composition
> offsets and edit lists results in a presentation timestamp for each sample
> on a global timeline (shared across all bitrates etc.). But if the
> timestamp of the first sample is X there is nothing to say, for example,
> "there are no other samples between time Y (< X) and X".
>
> The application that creates the demuxed files just needs to make sure the
> separate files both have the same segment start time.
>
> That's not always possible because of skew caused by audio frame durations
> being different from video frame intervals.
>
> I think we need to say that all source buffers share a common global
> timeline and that timestamps in the media segments must be mapped to that
> in a way that is common across source buffers. This means any offset
> applied to media internal timestamps needs to be the same across source
> buffers.
> It means that establishing such offsets needs to be done
> explicitly by the application or, if they are derived from timestamps in
> the media it needs to be done in a consistent way (in terms of which out of
> audio and video the time offset is taken from).
>
> I think this has implications for the push/pop time offset as well. They
> should be global methods which establish a global offset based on the
> next-appended segment(s).
>
> We do also need a way to handle the user starting content in the middle.
> If I have a 30 min content item and the user wants to start at minute 10
> (because of a bookmark, say) then I should be able to start appending data
> at position 10min in the source buffer timeline. The seek bar needs to show
> the playback starting at minute 10 and if the user seeks backwards this
> should be ok.
>
> pushOffset isn't right for this case because the media internal timestamps
> are correct: the first segment appended really does start at timestamp
> 10min.
>
> I wonder whether we should just say that the source buffer timestamp
> starts at zero and not derive a start point from the appended media. If the
> media internal timestamp corresponding to the start of the content is not
> zero you need to explicitly handle this with a pushOffset call?
>
> Applications could also just append the segments to a "scratch"
> SourceBuffer to see what the initial timestamp is and then use that
> information to compute the proper offset to apply. It's not the greatest
> solution, but it does provide a way for people to handle this if they
> aren't as careful about how they create their demuxed content.
>
> Aaron
>
> Hmm - no clear answer here - I'll think about this some more.
>
> …Mark
>
> Aaron
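
As a rough illustration of the two options discussed in this thread (re-basing the stream to 0 with a timestamp offset, or leaving the timestamps alone and seeking to the start of what was appended), here is a minimal sketch. The helper names offsetToRebaseToZero, initialSeekTarget and fakeTimeRanges are hypothetical and are not part of any spec; fakeTimeRanges only imitates the length/start()/end() shape of a TimeRanges object so the logic can run outside a browser.

```javascript
// Hypothetical helpers for illustration only; not part of the MSE or HTML specs.

// Offset that would re-base a stream onto a timeline starting at 0, given the
// first media timestamp (for example, as observed through buffered.start(0)
// on a "scratch" SourceBuffer, as suggested in the thread).
function offsetToRebaseToZero(firstMediaTimestamp) {
  return 0 - firstMediaTimestamp;
}

// Position the app would assign to currentTime when it does not re-base:
// an explicit seek time if it knows one, else the start of the first
// buffered range, else 0 when nothing is buffered yet.
function initialSeekTarget(buffered, desiredTime) {
  if (desiredTime !== undefined) {
    return desiredTime;
  }
  return buffered.length > 0 ? buffered.start(0) : 0;
}

// Stand-in with the same shape as a TimeRanges object (assumption: only
// length, start(i) and end(i) are needed here).
function fakeTimeRanges(ranges) {
  return {
    length: ranges.length,
    start: (i) => ranges[i][0],
    end: (i) => ranges[i][1],
  };
}

// A live stream whose first appended segment covers 600s to 610s.
const live = fakeTimeRanges([[600, 610]]);
console.log(offsetToRebaseToZero(live.start(0)));
console.log(initialSeekTarget(live));
```

In a page, the second value is what the app would assign to videoTag.currentTime after the first append, matching the 'videoTag.currentTime = videoTag.buffered.start(0)' suggestion at the top of this message.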
Received on Tuesday, 24 July 2012 23:42:32 UTC
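
The gap rule proposed in the quoted discussion (a gap of at most one frame interval is not a gap, a gap over 1 second is, and anything in between is left to the implementation) could be sketched as follows. This is only an illustration of that proposal, not spec text: the names classifyGap, audioVideoTolerance and MAX_GAP_SECONDS are made up here, and the 1 second threshold is one of the "magic numbers" the thread says still needs debating.

```javascript
// Illustrative only: names and thresholds are assumptions taken from the
// thread's discussion, not from any specification.

// Upper bound from the thread: gaps longer than this definitely stall playback.
const MAX_GAP_SECONDS = 1.0;

// Classify the gap between the end of one appended segment and the start of
// the next one, both expressed in seconds.
function classifyGap(gapSeconds, frameIntervalSeconds) {
  if (gapSeconds <= frameIntervalSeconds) {
    return "continuous"; // within one frame interval: playback continues
  }
  if (gapSeconds > MAX_GAP_SECONDS) {
    return "gap"; // playback should stall, as on a network outage
  }
  return "implementation-defined"; // the thread leaves this range open
}

// Tolerance for start-time differences between audio and video: the larger
// of the audio frame duration and the video frame interval.
function audioVideoTolerance(audioFrameSeconds, videoFrameIntervalSeconds) {
  return Math.max(audioFrameSeconds, videoFrameIntervalSeconds);
}

// Example: 30 fps video, 1024-sample audio frames at 44.1 kHz.
const videoInterval = 1 / 30;
const audioFrame = 1024 / 44100;
console.log(classifyGap(0.03, videoInterval));
console.log(classifyGap(0.5, videoInterval));
console.log(classifyGap(2, videoInterval));
console.log(audioVideoTolerance(audioFrame, videoInterval));
```

Keeping the middle range as an explicit third outcome mirrors the "implementations have to draw a line somewhere" remark rather than picking a cutoff the thread never settled on.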