- From: Aaron Colwell <acolwell@google.com>
- Date: Wed, 8 Aug 2012 14:07:17 -0700
- To: Kevin Streeter <kstreete@adobe.com>
- Cc: public-html-media@w3.org
- Message-ID: <CAA0c1bCwFoUO1i23jDi0nhid=jWqMdsBct76r2naqVxz3Vwuuw@mail.gmail.com>
Hi Kevin,

I just wanted to circle back with you to make sure my answer was
satisfactory. If so, then I'll start working on updating the spec so I can
resolve Bug 18389 <https://www.w3.org/Bugs/Public/show_bug.cgi?id=18389>.

Aaron

On Tue, Jul 24, 2012 at 4:42 PM, Aaron Colwell <acolwell@google.com> wrote:
> Using reply all this time... :)
>
> ---------- Forwarded message ----------
> From: Aaron Colwell <acolwell@google.com>
> Date: Tue, Jul 24, 2012 at 4:40 PM
> Subject: Re: [MSE] Establishing the Presentation Start Timestamp
> To: Kevin Streeter <kstreete@adobe.com>
>
> Hi Kevin,
>
> Yes. The app can either set an offset, or just append and then set
> 'videoTag.currentTime = videoTag.buffered.start(0)', since
> buffered.start(0) should contain the start time of the first segment.
>
> Aaron
>
> On Tue, Jul 24, 2012 at 4:37 PM, Kevin Streeter <kstreete@adobe.com> wrote:
>> Aaron, Mark,
>>
>> I'm a little unclear on how things will work for a live stream. The
>> user will typically start playback at some non-zero time, which represents
>> the "live" end of the stream. How does the timestamp offset account for
>> this? Does it require setting a negative offset that re-bases the stream
>> to 0 so that playback begins immediately?
>>
>> -K
>>
>> From: Mark Watson [mailto:watsonm@netflix.com]
>> Sent: Tuesday, July 24, 2012 2:19 PM
>> To: Aaron Colwell
>> Cc: <public-html-media@w3.org>
>> Subject: Re: [MSE] Establishing the Presentation Start Timestamp
>>
>> On Jul 24, 2012, at 2:13 PM, Aaron Colwell wrote:
>>
>> Hi Mark,
>>
>> Thanks for your comments. I too am starting to believe that we should
>> just have the SourceBuffer timelines start at 0 and NOT derive the
>> presentation start timestamp from the first segment appended. I agree that
>> the timestamp offset mechanism should be used to handle any content that
>> doesn't already start at 0.
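[Editor's note: to make Kevin's live-stream question and Aaron's answer above concrete, here is a minimal sketch of the two options — re-basing the stream with a negative offset versus keeping the original timestamps and seeking to the start of the buffered range. The helper names and example timestamps are hypothetical; in a real app these values would feed SourceBuffer.timestampOffset and video.currentTime.]

```javascript
// Pure helpers so the arithmetic is explicit; the MediaSource/video
// wiring is deliberately left out.

// Option 1: a negative offset that re-bases a live stream so its first
// segment lands at presentation time 0.
function rebaseOffset(firstSegmentTime) {
  return -firstSegmentTime;
}

// Option 2: keep the original timestamps and seek to the start of the
// first buffered range (mirrors videoTag.buffered.start(0)).
function playbackStart(bufferedStarts) {
  if (bufferedStarts.length === 0) throw new Error("nothing buffered yet");
  return bufferedStarts[0];
}

// A live segment whose internal timestamps begin at 3605.2 s:
console.log(rebaseOffset(3605.2));    // -3605.2
console.log(playbackStart([3605.2])); // 3605.2
```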
This might make things a little annoying for
>> live streams, but it is a one-time operation at the beginning of playback
>> to figure out what the appropriate timestamp offset needs to be. I think
>> the 90-99% case will be content that starts at 0, so we should optimize
>> for that. If offsets need to be applied, the app needs to know about them
>> or "discover" them by appending a segment to a "scratch" SourceBuffer and
>> seeing what SourceBuffer.buffered reports.
>>
>> Right - and if we need to provide a "cleaner" way for the app to peek at
>> the media timestamp, we can do that later when the need is clearer.
>>
>> I do have a question about your first append at 10 minutes example. Are
>> you saying that you want the HTMLMediaElement to implicitly seek to the
>> start of the first media segment appended? I think it might be less
>> surprising if the default playback start position
>> <http://dev.w3.org/html5/spec/media-elements.html#default-playback-start-position>
>> stays 0 and the app manually sets HTMLMediaElement.currentTime to
>> HTMLMediaElement.buffered.start(0), or to the desired seek time if you
>> know it.
>>
>> Agreed - that's what I intended (even if that's not what I wrote ;-)
>>
>> Otherwise, I'm not sure how to make the behavior you desire fit into the
>> descriptions specified in the offsets into the media resource
>> <http://dev.w3.org/html5/spec/media-elements.html#offsets-into-the-media-resource>
>> section of the HTML spec.
>>
>> Aaron
>>
>> On Wed, Jul 18, 2012 at 11:55 AM, Mark Watson <watsonm@netflix.com> wrote:
>>
>> On Jul 18, 2012, at 11:03 AM, Aaron Colwell wrote:
>>
>> Hi Mark,
>>
>> Comments inline...
>>
>> On Thu, Jul 12, 2012 at 2:32 PM, Mark Watson <watsonm@netflix.com> wrote:
>>
>> 4.
How close do the starting timestamps on the first media segments from
>> each SourceBuffer need to be?
>>
>> - In this example I've shown them to be only 30 milliseconds apart, but
>> would 0.5 seconds be acceptable? Would 2 seconds?
>> - How much time do we allow here before we consider there to be missing
>> data and playback can't start?
>> - What happens if the gap is too large?
>>
>> I think this is roughly the same question as 'what happens if I append a
>> video segment which starts X ms after the end of the last video segment'?
>>
>> If X <= one frame interval, this is definitely not a 'gap' and playback
>> continues smoothly. If X > 1 second, this is definitely a gap and playback
>> should stall (in the same way as it does today on a network outage).
>>
>> For X values in between, I am not sure: implementations have to draw a
>> line somewhere. A gap of multiple frame intervals could occur when
>> switching frame rate. You might also get a couple of frame intervals' gap
>> when switching if you do wacky things with frame reordering around segment
>> boundaries.
>>
>> When looking at differences between audio and video, we need to tolerate
>> differences up to the larger of the audio frame size and the video frame
>> interval.
>>
>> If the gap is too large, the element just stays in the same state.
>> Perhaps I append video from 0s and audio from 2s because my network
>> requests got re-ordered, and any millisecond now I am going to append the
>> 0-2s audio. Playback should start when that 0-2s audio is appended.
>>
>> [acolwell] I agree. We need to come up with some spec text for this, and
>> then we can debate the merits of these various magic numbers. Care to
>> volunteer for this?
:)

>> Ok, assign me a bug.
>>
>> Any insights or suggestions would be greatly appreciated.
>>
>> We have the same problem with push/popTimeOffset. Suppose I want your
>> media above to appear at offset 200s in both audio and video source
>> buffers. What I really want is for the audio to start at 200s and the
>> video at 200.030s.
>>
>> In this case the application knows better than the media what the
>> internal media times are. I know that the video segment has all the video
>> from time 0s, even though the first frame is at 30ms. I really want to
>> provide the actual offset to be applied to the internal timestamps, rather
>> than providing the source buffer time that the next segment should start
>> at.
>>
>> [acolwell] One way I think we could get around this is to mandate that
>> the media segments actually have a start time of 0. In WebM there is a
>> Cluster timestamp, and all blocks are relative to that timestamp. If the
>> Cluster timestamp is 0 and the first frame in the cluster is at 30ms, then
>> there is enough information for the UA to "do the right thing". I'm not
>> sure if a similar mechanism exists in ISO.
>>
>> Not really - the rather complex combination of decode times, composition
>> offsets and edit lists results in a presentation timestamp for each sample
>> on a global timeline (shared across all bitrates etc.). But if the
>> timestamp of the first sample is X, there is nothing to say, for example,
>> "there are no other samples between time Y (< X) and X".
>>
>> The application that creates the demuxed files just needs to make sure
>> the separate files both have the same segment start time.
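[Editor's note: Mark's gap heuristic above — up to one frame interval is never a gap, more than one second always is, with a tolerance at the larger of the audio frame size and the video frame interval — can be sketched as a small predicate. This is one reading of the thread, not spec text, and the grey-zone choice is an implementation decision.]

```javascript
// deltaSeconds: time between the end of the last appended segment and
// the start of the next one, on the same SourceBuffer.
function isGap(deltaSeconds, audioFrameDuration, videoFrameInterval) {
  // Tolerate differences up to the larger of the audio frame size and
  // the video frame interval.
  const tolerance = Math.max(audioFrameDuration, videoFrameInterval);
  if (deltaSeconds <= tolerance) return false; // playback continues smoothly
  if (deltaSeconds > 1.0) return true;         // definitely a gap: stall
  // In between, implementations must draw a line; this sketch stalls.
  return true;
}

// 30 fps video (~33 ms frames) with AAC audio (~21 ms frames):
console.log(isGap(0.030, 0.021, 0.033)); // false - within one frame interval
console.log(isGap(2.0, 0.021, 0.033));   // true  - well past one second
```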
That's not always possible because of skew caused by audio frame
>> durations being different from video frame intervals.
>>
>> I think we need to say that all source buffers share a common global
>> timeline and that timestamps in the media segments must be mapped to it
>> in a way that is common across source buffers. This means any offset
>> applied to media-internal timestamps needs to be the same across source
>> buffers. It means that establishing such offsets needs to be done
>> explicitly by the application or, if they are derived from timestamps in
>> the media, it needs to be done in a consistent way (in terms of which of
>> audio and video the time offset is taken from).
>>
>> I think this has implications for the push/pop time offset as well. They
>> should be global methods which establish a global offset based on the
>> next-appended segment(s).
>>
>> We also need a way to handle the user starting content in the middle.
>> If I have a 30 min content item and the user wants to start at minute 10
>> (because of a bookmark, say), then I should be able to start appending
>> data at position 10 min on the source buffer timeline. The seek bar needs
>> to show playback starting at minute 10, and if the user seeks backwards
>> this should be OK.
>>
>> pushOffset isn't right for this case because the media-internal
>> timestamps are correct: the first segment appended really does start at
>> timestamp 10 min.
>>
>> I wonder whether we should just say that the source buffer timeline
>> starts at zero and not derive a start point from the appended media.
If the
>> media-internal timestamp corresponding to the start of the content is not
>> zero, you need to explicitly handle this with a pushOffset call?
>>
>> Applications could also just append the segments to a "scratch"
>> SourceBuffer to see what the initial timestamp is, and then use that
>> information to compute the proper offset to apply. It's not the greatest
>> solution, but it does provide a way for people to handle this if they
>> aren't as careful about how they create their demuxed content.
>>
>> Aaron
>>
>> Hmm - no clear answer here - I'll think about this some more.
>>
>> …Mark
>>
>> Aaron
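[Editor's note: once the "scratch" SourceBuffer probe Aaron describes has run, the remaining work is one subtraction: the offset to apply is the desired presentation time minus the first timestamp the scratch buffer reported. A hedged sketch, with the probing itself stubbed out since it depends on the app's MediaSource wiring; the function name is hypothetical.]

```javascript
// probedFirstTimestamp: what scratchBuffer.buffered.start(0) reported
// after a test append. desiredPresentationTime: where the app wants the
// content to appear on the shared presentation timeline (usually 0).
function discoverOffset(probedFirstTimestamp, desiredPresentationTime) {
  return desiredPresentationTime - probedFirstTimestamp;
}

// Content whose internal clock starts at 412.7 s, to be presented from 0:
console.log(discoverOffset(412.7, 0)); // -412.7

// Bookmark case from the thread: timestamps really start at 600 s (minute
// 10) and should appear at 600 s, so no offset is needed - just seek.
console.log(discoverOffset(600, 600)); // 0
```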
Received on Wednesday, 8 August 2012 21:07:46 UTC