Re: [MSE] Establishing the Presentation Start Timestamp from Aaron Colwell on 2012-07-18 (public-html-media@w3.org from July 2012)

From: Aaron Colwell <acolwell@google.com>
Date: Wed, 18 Jul 2012 10:00:57 -0700
To: Kevin Streeter <kstreete@adobe.com>
Cc: "<public-html-media@w3.org>" <public-html-media@w3.org>
Message-ID: <CAA0c1bCiaTPCeWqM=q8rNbbES6eo7L8KSctj98jLwS9-pPH27w@mail.gmail.com>

Kevin,

If I understand you correctly, you are proposing that a "preroll" state be
added that prevents a transition to HAVE_METADATA, but allows media
segments to be appended. Once the web application determines that it has
appended enough segments for the start time to be determined, it would call
a method to transition out of the "preroll" state and then the MediaSource
could determine the start time and allow the HTMLMediaElement.readyState to
transition to HAVE_METADATA. Is this correct?
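
To make the idea concrete, here is a minimal sketch of how such a "preroll" state could behave. It is a toy model only: PrerollModel, appendSegment, and endPreroll are hypothetical names, not proposed IDL, and picking the earliest timestamp when leaving preroll is an assumption about how the start time would be resolved.

```typescript
// Hypothetical model of the proposed "preroll" state. None of these names
// come from the Media Source Extensions draft; they only illustrate the idea.
type ReadyState = "HAVE_NOTHING" | "HAVE_METADATA";

class PrerollModel {
  readyState: ReadyState = "HAVE_NOTHING";
  startTime: number | null = null;

  // Earliest timestamp (in seconds) appended so far, keyed by source buffer.
  private earliest: { [bufferId: string]: number } = {};

  // Record the first timestamp of a media segment appended to a buffer.
  // Appends after preroll are accepted but no longer affect the start time.
  appendSegment(bufferId: string, firstTimestamp: number): void {
    if (this.startTime !== null) return;
    const seen = this.earliest[bufferId];
    if (seen === undefined || firstTimestamp < seen) {
      this.earliest[bufferId] = firstTimestamp;
    }
  }

  // Called by the application once every track has received data.
  // Assumption: the start time is the earliest timestamp seen in preroll.
  endPreroll(): void {
    const ids = Object.keys(this.earliest);
    if (ids.length === 0) {
      throw new Error("no media segments appended during preroll");
    }
    this.startTime = Math.min(...ids.map((id) => this.earliest[id]));
    this.readyState = "HAVE_METADATA";
  }
}

const model = new PrerollModel();
model.appendSegment("video", 0.03); // video arrives first, starts at 30 ms
model.appendSegment("audio", 0);    // audio starts at 0
model.endPreroll();
console.log(model.startTime, model.readyState);
```

In this model the order of the appends does not matter, because nothing is decided until the application explicitly ends preroll.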

I believe it should be relatively easy to modify the MediaSource object to
do this. If people are ok with this idea I can draft up some IDL and text
to give people a feel for what this could look like.

Aaron




On Thu, Jul 12, 2012 at 11:06 AM, Kevin Streeter <kstreete@adobe.com> wrote:

> Aaron,
>
>   This is a tricky one, we encountered a similar situation dealing with
> unmixed content in Flash.  The problem is really that at startup time,
> before you have anything in the playback buffer, you have an ambiguous
> situation when you push the first segment (for the first track).  Will
> there be another track?  More segments for that single track?  There is no
> way for the UA to know unless the application tells it.
>
>   One possible way to deal with this is to have a pre-roll state where you
> can push segments without playback starting.  You would resolve the time
> issues only after playback actually started.  This would allow the
> application to initialize all tracks.
>
>   Another way would be to assume that all *initialized* tracks have at
> least some data.  So if you create a track programmatically, or push in an
> initialization segment, the UA will wait to start playback until at least
> some content for that track arrives.
>
>   The first mechanism is more explicit (which IMO is generally better),
> but would also probably mean changes in the playback state machine.
>
> -K
>
> From: Aaron Colwell [mailto:acolwell@google.com]
> Sent: Thursday, July 12, 2012 10:28 AM
> To: <public-html-media@w3.org>
> Subject: [MSE] Establishing the Presentation Start Timestamp
>
> Hi,
>
> While doing some testing with demultiplexed content that uses separate
> SourceBuffers for the audio & video streams, we ran into some issues around
> establishing the presentation start timestamp that I don't think are
> covered well in the existing spec text.
>
> Section 6.1.3 <http://dvcs.w3.org/hg/html-media/raw-file/tip/media-source/media-source.html#webm-start-timestamp> for
> WebM states:
>
> The timestamp in the first block of the first media segment appended
> establishes the starting timestamp for the presentation timeline. All media
> segments appended after this first segment are expected to have timestamps
> greater than or equal to this timestamp.
>
> Section 6.2.3 <http://dvcs.w3.org/hg/html-media/raw-file/tip/media-source/media-source.html#iso-start-timestamp> has
> similar text for ISO.
>
> This language is pretty straightforward if we are only dealing with a
> single SourceBuffer. When more than one SourceBuffer is involved, things get
> a little trickier when the first media segments for the SourceBuffers
> don't start with the same timestamp.
>
> Say I have an audio stream that starts at timestamp 0, and the video
> stream starts at 30 milliseconds. If I follow the existing language very
> strictly, then whichever stream appends a media segment first establishes
> the presentation start time. This means that I can either have a start time
> of 0 or 30 milliseconds. This raises several questions that I think need to
> be discussed.
>
> 1. Should we expect the web application to be aware of this situation and
> always ensure that the earliest segment gets appended first?
>
> 2. Should we wait until the first media segments are appended to all
> SourceBuffers in MediaSource.activeSourceBuffers before determining the
> start time and then simply take the earliest timestamp?
>
> 3. If a media segment is appended that starts before the established
> presentation start time and continues past it, how should we handle that?
>
>   - Should this trigger an error?
>
>   - Should it be treated like an end overlap <http://dvcs.w3.org/hg/html-media/raw-file/tip/media-source/media-source.html#source-buffer-overlap-end> where
> the presentation start time acts like the end of a range already in the
> buffer? This would essentially keep everything after the first random
> access point that has a timestamp >= the presentation start timestamp.
>
> 4. How close do the starting timestamps on the first media segments from
> each SourceBuffer need to be?
>
>   - In this example I've shown them to be only 30 milliseconds apart, but
> would 0.5 seconds be acceptable? Would 2 seconds?
>
>   - How much time do we allow here before we consider there to be missing
> data and playback can't start?
>
>   - What happens if the gap is too large?
>
> Any insights or suggestions would be greatly appreciated.
>
> Aaron
>
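
For reference, questions 2 and 4 in the quoted message could be sketched roughly as follows. This is an illustration under stated assumptions, not spec behavior: resolveStartTime, its result shape, and the maxGapSeconds tolerance are all hypothetical, and the thread does not settle what tolerance (if any) would be appropriate.

```typescript
// Sketch of question 2: wait for a first media segment on every active
// source buffer, then take the earliest timestamp. The names and the
// maxGapSeconds tolerance (question 4) are assumptions, not spec text.
type StartTimeResult =
  | { status: "waiting"; missing: string[] }
  | { status: "ready"; startTime: number }
  | { status: "gap-too-large"; gap: number };

function resolveStartTime(
  // First appended timestamp per buffer in seconds; null means no data yet.
  firstTimestamps: { [bufferId: string]: number | null },
  maxGapSeconds: number
): StartTimeResult {
  const ids = Object.keys(firstTimestamps);
  const missing = ids.filter((id) => firstTimestamps[id] === null);
  if (ids.length === 0 || missing.length > 0) {
    return { status: "waiting", missing };
  }
  const times = ids.map((id) => firstTimestamps[id] as number);
  const earliest = Math.min(...times);
  const latest = Math.max(...times);
  const gap = latest - earliest;
  if (gap > maxGapSeconds) return { status: "gap-too-large", gap };
  return { status: "ready", startTime: earliest };
}

// Audio starts at 0 and video at 30 ms, as in the example above.
const pending = resolveStartTime({ audio: 0, video: null }, 0.5);
const ready = resolveStartTime({ audio: 0, video: 0.03 }, 0.5);
const tooFar = resolveStartTime({ audio: 0, video: 2 }, 0.5);
console.log(pending.status, ready.status, tooFar.status);
```

With a 0.5 second tolerance, the 30 millisecond offset resolves to a start time of 0, while a 2 second offset is reported as a gap rather than silently accepted. What the UA should do in that last case is exactly the open question.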
Received on Wednesday, 18 July 2012 17:01:33 UTC