Re: [MSE] New Proposal for Bug 20901 from Aaron Colwell on 2013-05-14 (public-html-media@w3.org from May 2013)

From: Aaron Colwell <acolwell@google.com>
Date: Tue, 14 May 2013 07:10:52 -0700
To: Cyril Concolato <cyril.concolato@telecom-paristech.fr>
Cc: "<public-html-media@w3.org>" <public-html-media@w3.org>
Message-ID: <CAA0c1bD3UOGumWyMn1kVtbSQ5E-OjT9M3JY6An1UxDXUW8T7nw@mail.gmail.com>
Hi Cyril,

Comments inline


On Tue, May 14, 2013 at 6:32 AM, Cyril Concolato <
cyril.concolato@telecom-paristech.fr> wrote:

> Hi Aaron,
>
> On 02/05/2013 00:39, Aaron Colwell wrote:
>
>> At the F2F meeting last week I agreed to craft a new proposal for
>> handling out-of-order appends, appends w/o internal timestamp knowledge,
>> and discontinuities in MPEG2 TS. I'm going to start out by outlining the
>> relevant use cases as I understand them and then propose the new solution
>> that I believe will address all these use cases and avoid the issues with
>> the current solution.
>>
>> *_Use Cases:_*
>> *Use Case A: Out-of-order media segment appends.*
>>  Formats like ISOBMFF and WebM have self-contained media segments with
>> clear start and end boundaries. Applications should be able to append these
>> segments in any order. These segments can be split across many appendXXX()
>> calls without any ambiguity because all coded frames within a media segment
>> are guaranteed to be adjacent in the presentation.
>>
> There seems to be an assumption here that when doing out-of-order appends
> the web application has no notion of numbering of segments. In adaptive
> streaming solutions (at least DASH, HLS) and in peer-to-peer solutions (to
> the best of my knowledge), you have an index number per segment. Could you
> clarify the use case?
>

Yes. MSE ignores the segment numbering because it can be unreliable for
content that comes from multiple sources. Content encoded at different
bitrates doesn't necessarily have the same number of segments. The
application shouldn't need to care about these numbers. It is also possible
for an application to be doing out-of-order appends without knowing it. An
application could be receiving a sequence of segments over a WebSocket
without there needing to be markers for where segments start and end. The
sender can order the segments however it wants without burdening the
application with segment boundary info. Paying attention to the segment
index numbering would just make things more complicated for the application.
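
To illustrate the WebSocket case, here is a minimal sketch of an application
that forwards whatever bytes arrive, in arrival order, without tracking segment
indices or boundaries. The names AppendTarget and AppendPump are hypothetical
and only for illustration; AppendTarget stands in for the small part of a
SourceBuffer the sketch assumes (an "updating" flag and an append call, written
here as appendBuffer in place of appendXXX()).

```typescript
// Hypothetical stand-in for the parts of a SourceBuffer this sketch relies on.
interface AppendTarget {
  readonly updating: boolean;
  appendBuffer(data: Uint8Array): void;
}

// Queues incoming chunks and hands them to the target one at a time, in
// arrival order. It never inspects the bytes, so it has no notion of segment
// numbers or of where a segment starts or ends.
class AppendPump {
  private queue: Uint8Array[] = [];

  constructor(private target: AppendTarget) {}

  // Call for every chunk received, e.g. from a WebSocket message handler.
  push(chunk: Uint8Array): void {
    this.queue.push(chunk);
    this.drain();
  }

  // Call when the target has finished an append, e.g. on "updateend".
  drain(): void {
    if (this.target.updating) return; // an append is still in progress
    const next = this.queue.shift();
    if (next !== undefined) this.target.appendBuffer(next);
  }

  // Number of chunks still waiting to be appended.
  get pending(): number {
    return this.queue.length;
  }
}
```

In a page, the application would call push() from the socket's message handler
and drain() from the buffer's "updateend" handler; the sender alone decides the
order of the segments.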


>
>> *Use Case B: Appending media into a continuous sequence w/o knowledge of
>> internal timestamps.*
>>  Some applications want to create a presentation by concatenating media
>> segments from different sources without knowledge of the timestamps inside
>> the segments. Each media segment appended should be placed, in the
>> presentation timeline, immediately after the previously appended segment
>> independent of what the internal timestamps are. At the beginning of each
>> media segment, a new timestampOffset value is calculated so that the
>> timestamps in the media segment will get mapped to timestamps that
>> immediately follow the end of the previous media segment.
>>
>> *Use Case C: Place media at a specific location in the timeline w/o
>> knowledge of internal timestamps.*
>>
>>  This is related to Use Case B. This case is useful for placing media
>> segments from a third party in the middle of a presentation. It also allows
>> an application that receives media segments from a live source to easily
>> map the first segment received to presentation time 0.
>>
>> *Use Case D: Handling of MPEG2-TS discontinuities without application
>> intervention.*
>>  MPEG2-TS streams are allowed to have timestamp discontinuities that can
>> make it difficult for the UA to detect out-of-order appends when the
>> discontinuity is split across two appendXXX() calls. The web application
>> likely doesn't know where in the bytestream these discontinuities occur so
>> MSE needs to provide a mode where a sequence of appendXXX() calls are always
>> considered to be adjacent. In this mode the application needs to provide an
>> explicit signal, like an abort() call, to indicate that the next
>> appendXXX() is not adjacent to the previous calls.
>>
> Also, the current approach assumes that the application does not know what
> a segment is, but this may not be the major use case. In case the
> application knows which bytes start and end the segment, it could inform
> the media engine and the media engine would know how to deal with
> discontinuity (even in out of order appends, unless the discontinuity
> starts the segment). Could you tell us the use cases you have in mind where
> you would do out-of-order appends without knowing where the segment starts
> or ends?


I think it is very useful to allow applications to be built without
explicit knowledge of segment boundaries. This may not be common right now
for DASH & HLS use cases, but I believe there will be applications in the
future that will stream content to clients via RTCPeerConnection or
WebSockets and the sender won't want to burden the application with segment
boundary details. Out-of-order appends could be used to backfill low
quality sections of the presentation w/o having to notify the application.
I don't think providing an explicit way for the application to mark segment
boundaries would actually improve things. I believe it is much more likely
that developers would use these calls incorrectly.

Another thing to keep in mind is that in the case of MPEG2-TS the concept
of a segment is a little fuzzy. There are no clear start and end markers.
You just have a set of TS packets that are grouped together in a file that
has no special start or end marker. Since you don't have to append a whole
segment at one time, this leads to an ambiguity between discontinuities and
out-of-order appends. In MPEG2-TS you don't ever really know when the
segment "ends" and a seek could happen at any time so the possibility of an
out-of-order append is always present. The new solution in the current spec
prevents the application from having to worry about where discontinuities
and segment boundaries are for MPEG2-TS. The application just puts the
SourceBuffer into 'sequence' mode and starts appending.
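
As a rough model of what 'sequence' placement means (Use Cases B and C above),
the sketch below maps each appended segment so that it starts where the
previous one ended, whatever its internal timestamps are. This is a simplified
illustration and not the algorithm text from the spec; Frame, SequenceTimeline,
appendSegment and setNextStart are hypothetical names, times are assumed to be
in milliseconds, and the first frame of a segment is assumed to carry its
earliest timestamp.

```typescript
// One coded frame, times in milliseconds (hypothetical shape).
interface Frame {
  pts: number;      // timestamp as found inside the media segment
  duration: number; // frame duration
}

class SequenceTimeline {
  // Presentation time at which the next appended segment will be placed.
  private nextStart = 0;

  // Use Case C: place the next segment at a specific presentation time.
  setNextStart(timeMs: number): void {
    this.nextStart = timeMs;
  }

  // Use Case B: compute a fresh offset for this segment so its first frame
  // lands at nextStart, then return the mapped presentation timestamps.
  appendSegment(frames: Frame[]): number[] {
    const first = frames[0];
    if (first === undefined) return [];
    const offset = this.nextStart - first.pts;
    let end = this.nextStart;
    const mapped = frames.map((f) => {
      end = Math.max(end, f.pts + offset + f.duration);
      return f.pts + offset;
    });
    this.nextStart = end; // the following segment starts right after this one
    return mapped;
  }
}
```

For example, a segment whose frames start at internal time 90000 is mapped to
start at 0, and a following segment that starts at internal time 500 is mapped
to start right after the first one ends.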


Aaron


>
>
> Cyril
>
> --
> Cyril Concolato
> Maître de Conférences/Associate Professor
> Groupe Multimedia/Multimedia Group
> Telecom ParisTech
> 46 rue Barrault
> 75 013 Paris, France
> http://concolato.wp.mines-telecom.fr/
>
>
>
Received on Tuesday, 14 May 2013 14:11:25 UTC