Re: [MSE] Updated MPEG-2 TS Byte Stream Requirements from Aaron Colwell on 2012-09-24 (public-html-media@w3.org from September 2012)

From: Aaron Colwell <acolwell@google.com>
Date: Mon, 24 Sep 2012 13:03:09 -0700
To: Bob Lund <B.Lund@cablelabs.com>
Cc: "public-html-media@w3.org" <public-html-media@w3.org>, Alex Giladi <alex.giladi@huawei.com>
Message-ID: <CAA0c1bDfv3m3Dwo4LCpva93a5cLDAX8y2CpK9H=2m-ckgkzUog@mail.gmail.com>
Hi Bob,

Thanks for the response. I've spent some quality(?) time with the MPEG2-TS
& DASH specs over the last couple of days so hopefully my comments will be
a little more informed this time around. :)

Comments inline...

On Mon, Sep 24, 2012 at 7:44 AM, Bob Lund <B.Lund@cablelabs.com> wrote:

> Hi Aaron,
>
> See in-line for responses to your questions from Alex and me.
>
> Thanks,
> Bob
>
> From: Aaron Colwell <acolwell@google.com>
> Date: Wednesday, September 19, 2012 7:33 PM
> To: Bob Lund <b.lund@cablelabs.com>
> Cc: "public-html-media@w3.org" <public-html-media@w3.org>
> Subject: Re: [MSE] Updated MPEG-2 TS Byte Stream Requirements
>
> Hi Bob,
>
> Thanks for doing this. Sorry it has taken me a little while to respond.
> I've got a few questions and hopefully they won't seem too silly since I
> have no experience with MPEG-2 TS. I'll blame the MPEG-2 TS Wikipedia
> articles if I get something totally wrong. ;)
>
> Many of these questions come from looking at the differences between the
> old proposal and your new version. My understanding of the old version
> mapped easily to my understanding of HLS. The new text signals to me that
> you are going for something more complex.
>
> Here are the questions that came to mind:
>
> 1. Why is PSI included in both media segments and initialization segments?
> All these tables look like initialization segment only material.
>
>
> PSI is required by MPEG-2 Systems to appear very frequently inband. So,
> every media segment longer than ~100ms will contain PSI.
>

[acolwell] So the problem here is that it would look like an initialization
segment appeared in the middle of a media segment, because the PSI might
appear in the middle of a GOP?


>
> 2. Why was the 'transport_error_indicator set to 0' requirement removed?
>  a. Why do we want to pass data we know is corrupt to the UA?
>  b. Why can't the service the web application is talking to simply strip
> these bad packets?
>
>
> The flag set means that the stream is broken. This should be taken care of
> at the fragmentor, before ever being segmented.
>
>

[acolwell] If the flag being set means that the stream is broken then we
should restore the requirement that this always be set to 0. This makes it
clear that the UA should signal an error if it encounters a TS packet with
this bit set.
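
A minimal sketch of the check described above, assuming the UA inspects each 188-byte TS packet header before demuxing. The names `check_packet` and `TransportStreamError` are hypothetical and exist only for illustration; the header layout (0x47 sync byte, transport_error_indicator as the top bit of the second byte, 13-bit PID) follows MPEG-2 Systems.

```python
TS_PACKET_SIZE = 188  # fixed MPEG-2 TS packet length in bytes
SYNC_BYTE = 0x47      # first byte of every TS packet


class TransportStreamError(ValueError):
    """Hypothetical error type a UA-like parser might raise."""


def check_packet(packet: bytes) -> int:
    """Validate one TS packet header and return its 13-bit PID."""
    if len(packet) != TS_PACKET_SIZE:
        raise TransportStreamError("packet is not 188 bytes long")
    if packet[0] != SYNC_BYTE:
        raise TransportStreamError("missing 0x47 sync byte")
    # transport_error_indicator is the top bit of the second header byte.
    if packet[1] & 0x80:
        raise TransportStreamError("transport_error_indicator is set")
    # The PID is the low 5 bits of byte 1 followed by all of byte 2.
    return ((packet[1] & 0x1F) << 8) | packet[2]
```

With the requirement restored, a packet that has the bit set would be rejected outright rather than passed on to the decoder.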

>
> 3. How is the "self-initializing" media segment different from an
> initialization segment before a standard media segment? I don't really
> understand this distinction.
>
>
> It’s the same, but “self-initializing” implies having PSI prior to media
> packets in the media segment.
>

[acolwell] I think in the Media Source context this doesn't need to be
explicitly stated like it is in the DASH spec. In Media Source, the
initialization segment must come before any media segments so these two
scenarios don't look any different from the UA's point of view. The web
application doesn't have to append full segments at a time so it is
possible to append a self-initializing media segment in a way that looks
like an init segment followed by a normal media segment. I don't think the
text needs to treat these two situations separately.


>
> 4. How is the boundary between media segments detected? Does the Random
> Access indicator in the Adaptation Field mark the beginning of a new media
> segment?
>
>
> No. Segment boundary is unmarked; the random access indicator indicates
> the beginning of an AU that has SPS and PPS in front of it in the AVC case.
> So in our case the first packet carrying media data will have this bit on,
> but the opposite is not necessarily true.
>

[acolwell] So this could be a problem if there is no straightforward way
to identify where the beginning and end of media segments are. I think we
need to come up with detailed rules for how to detect the beginning of a
media segment as defined in the Media Source spec. The DASH definitions
have the benefit of using files & byte ranges to delineate where segment
boundaries are. These signals aren't present in Media Source since segments
can be appended in pieces.
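
For reference, a rough sketch of how the random access indicator and discontinuity indicator can be read from a TS packet. The function names are hypothetical. The bit positions come from the MPEG-2 Systems adaptation field layout. As noted in the quoted reply, a set random access indicator does not by itself mark a media segment boundary, so this is only a building block for whatever rules get defined.

```python
def adaptation_flags(packet: bytes) -> int:
    """Return the adaptation field flags byte, or 0 if the packet has none."""
    adaptation_field_control = (packet[3] >> 4) & 0x03
    # Values 2 and 3 mean an adaptation field is present.
    if adaptation_field_control not in (2, 3):
        return 0
    adaptation_field_length = packet[4]
    # A zero-length adaptation field carries no flags byte.
    if adaptation_field_length == 0:
        return 0
    return packet[5]


def has_random_access_indicator(packet: bytes) -> bool:
    """True if random_access_indicator (0x40) is set."""
    return bool(adaptation_flags(packet) & 0x40)


def has_discontinuity_indicator(packet: bytes) -> bool:
    """True if discontinuity_indicator (0x80) is set."""
    return bool(adaptation_flags(packet) & 0x80)
```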


>
> 5. What restrictions are put on the continuity counter & discontinuity
> indicator fields? It seems like these could confuse the UA if the web
> application tried to do "out of order" media segment appending. It seems to
> me that there should be no jumps in the counter or discontinuities inside a
> single media segment.
>
>
> We put none. We can explicitly disallow random jumps, but explicitly
> signaled discontinuities can stay – they are perfectly legal, and will
> arise if two parts of the segment were “glued” together from different
> sources.
>

[acolwell] So here is another case where I think we need to clearly
identify where media segment boundaries are. I could maybe see this working
out ok if the discontinuity happens within a single media segment, because
we have a larger unit that we can use to resolve the relative jump. If we
don't have a good definition for the boundaries then I think the
discontinuity could get confused with an out-of-order media segment and the
data would likely get placed in an unexpected part of the SourceBuffer. I
think there is a disconnect here between the Media Source append model and
MPEG2-TS's expected playout model. In Media Source discontinuities in the
timeline indicate out of order segment appending. In MPEG2-TS
discontinuities in the timeline are allowed, but don't imply out of order
media. The expectation is that playback will continue forward, just with
new PTS timestamp offsets applied to the new TS packets. This needs to be
resolved so that the situation isn't ambiguous to the UA.  Timestamp
rollover is a similar situation.
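
To make the rollover case concrete, here is a small sketch of unwrapping a 33-bit, 90 kHz PTS onto a timeline that keeps moving forward. The function name `unwrap_pts` and the "nearest to the previous timestamp" rule are assumptions for illustration, not something the byte stream text specifies.

```python
PTS_MODULUS = 1 << 33  # PTS is a 33-bit counter
PTS_CLOCK_HZ = 90_000  # PTS ticks per second


def unwrap_pts(previous_unwrapped: int, raw_pts: int) -> int:
    """Map a raw 33-bit PTS onto a timeline that does not wrap.

    Assumes consecutive timestamps are less than half the 33-bit range
    apart, so the candidate nearest to the previous value is chosen.
    """
    # Start from the same 2**33 "epoch" as the previous timestamp.
    base = previous_unwrapped - (previous_unwrapped % PTS_MODULUS)
    candidate = base + raw_pts
    half = PTS_MODULUS // 2
    if candidate - previous_unwrapped > half:
        candidate -= PTS_MODULUS  # small step backwards across a wrap
    elif previous_unwrapped - candidate > half:
        candidate += PTS_MODULUS  # the counter rolled over
    return candidate
```

Note that this rule cannot tell a rollover or signaled discontinuity apart from an out-of-order append that happens to land far away in the timeline, which is the ambiguity described above.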


> 6. Can you provide a little more background about why "The media segment
> will not rely on initialization information in another media segment." was
> added? I don't really understand. This implies to me that the bytestream
> would have to constantly alternate between initialization segments and
> media segments.
>
>
> This implies that PSI appearing in the media segment describes the content
> of this segment entirely and completely. You don’t need to alternate, as
> the frequency of PSI repetition within the segment is fairly high.
>

[acolwell] Sorry. I don't think I was clear. I meant that the repeated PSI
could simply be interpreted by the UA as the initialization segment being
repeatedly interleaved with the media segments. The one situation where this
wouldn't make sense is when the PSI appears in the middle of a GOP. In that
case this data couldn't be considered an initialization segment, because
that would mean that the media data that followed would have to be the
start of a media segment which isn't possible because it wouldn't start
with a random access point. We just need to establish rules that allow the
UA to be able to detect when the PSI is intended to be an init segment and
when it is data that appears within a media segment.
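
As a purely hypothetical illustration of what such a rule could look like (it is not a rule agreed in this thread), the sketch below labels a run of PSI packets as "init" only when the first non-PSI packet after it has the random access indicator set. The names `classify_psi_runs` and `psi_pids` are made up. The caller is assumed to already know which PIDs carry PSI (the PAT on PID 0 plus any PMT PIDs).

```python
PAT_PID = 0x0000  # the Program Association Table always uses PID 0


def pid_of(packet: bytes) -> int:
    """Extract the 13-bit PID from a TS packet header."""
    return ((packet[1] & 0x1F) << 8) | packet[2]


def starts_random_access(packet: bytes) -> bool:
    """True if the packet has an adaptation field with bit 0x40 set."""
    has_adaptation_field = ((packet[3] >> 4) & 0x03) in (2, 3)
    return has_adaptation_field and packet[4] > 0 and bool(packet[5] & 0x40)


def classify_psi_runs(packets, psi_pids):
    """Label each run of consecutive PSI packets.

    Returns (index_of_first_psi_packet, label) tuples. The label is "init"
    if the first non-PSI packet after the run has random_access_indicator
    set, and "in-segment" otherwise (including a run at the end of input).
    """
    labels = []
    i = 0
    while i < len(packets):
        if pid_of(packets[i]) not in psi_pids:
            i += 1
            continue
        start = i
        # Skip to the end of this run of PSI packets.
        while i < len(packets) and pid_of(packets[i]) in psi_pids:
            i += 1
        is_init = i < len(packets) and starts_random_access(packets[i])
        labels.append((start, "init" if is_init else "in-segment"))
    return labels
```

A real rule would need to handle more cases than this, for example a packet from another elementary stream (such as audio) arriving between the PSI and the first video random access point.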



> 7. What do you expect a deployment that uses MPEG-2 TS as you describe w/ MSE
> to look like? I'm trying to get a sense of how the web application will
> fetch the media and what type of infrastructure you expect it to be talking
> to.
>
>
> One use case is MPEG DASH distribution using one of the two DASH MPEG-2 TS
> profiles. Another use case is HLS.
>

[acolwell] So in all these scenarios, the TS will be split into pieces and
hosted on web servers?

> 8. What are you trying to provide with MPEG-2 TS that can't be accomplished
> with the ISO BMFF?
>
>
> A considerable amount of video content in use today uses MPEG-2 TS.
>

[acolwell] Are we talking about web accessible content here? I know it is
popular in broadcast equipment, but I haven't seen much content on the web
aside from the occasional HLS content here and there. Can I get more
examples please? I'm just trying to understand what we are talking about
here.


>   a. What are the tradeoffs?
>
>
> Only supporting ISO BMFF means transmuxing this MPEG-2 TS content.
>

[acolwell] Why is that problematic? Isn't the TS content already being
transformed in some way to make it conform to these requirements? It seems
like at a minimum you'd want to remove all the redundant PSI since the
content is being transported over a lossless channel.


>   b. Why is it beneficial for UAs to take on the burden of supporting
> another format?
>
>
> Some UAs support MPEG-2 TS. Some UAs will be required to support MPEG-2
> TS. Defining the MPEG-2 TS byte stream requirements doesn't require a UA to
> support that format, does it?
>

[acolwell] Ok. I'm trying to understand whether these UAs will be typical
browsers or special cable operator specific applications. If it is the
latter it might be better to have this specified in a cable spec. I'm trying
to understand what minimal set of MPEG-2 TS makes sense in a web
context. If there are further requirements needed by operators, that is
fine, but it isn't clear to me that the place for that is in a web spec.

You are correct that defining it doesn't mean UAs have to support it since
the plan was to put this in a non-normative section like ISO BMFF & WebM
are.


>
> I'd also like to request that the Encrypted Media Segments section be
> removed and the discussion of that part be moved to the EME spec work. I
> understand it is important, but I'd like to keep encryption topics outside
> of MSE. Your definitions for initialization segments appear to be broad
> enough to include the necessary information just like the WebM & ISO BMFF
> sections are.
>
>
> OK.
>

[acolwell] Thank you.

I appreciate your patience with all my questions. I'm really just trying to
understand the scope and complexity of what is being proposed and why it is
important to people.

Aaron
Received on Monday, 24 September 2012 20:05:07 UTC