Re: [MSE] Updated MPEG-2 TS Byte Stream Requirements from Bob Lund on 2012-09-24 (public-html-media@w3.org from September 2012)

From: Bob Lund <B.Lund@CableLabs.com>
Date: Mon, 24 Sep 2012 14:13:03 -0600
To: Aaron Colwell <acolwell@google.com>
CC: "public-html-media@w3.org" <public-html-media@w3.org>, Alex Giladi <alex.giladi@huawei.com>
Message-ID: <CC86195F.20E8A%b.lund@cablelabs.com>

Hi Aaron,

WRT specs, I feel your pain. See comment inline.

Bob

From: Aaron Colwell <acolwell@google.com>
Date: Monday, September 24, 2012 2:03 PM
To: Bob Lund <b.lund@cablelabs.com>
Cc: "public-html-media@w3.org" <public-html-media@w3.org>, Alex Giladi <alex.giladi@huawei.com>
Subject: Re: [MSE] Updated MPEG-2 TS Byte Stream Requirements

Hi Bob,

Thanks for the response. I've spent some quality(?) time with the MPEG2-TS & DASH specs over the last couple of days so hopefully my comments will be a little more informed this time around. :)

Comments inline...

On Mon, Sep 24, 2012 at 7:44 AM, Bob Lund <B.Lund@cablelabs.com> wrote:
Hi Aaron,

See in-line for responses to your questions from Alex and me.

Thanks,
Bob

From: Aaron Colwell <acolwell@google.com>
Date: Wednesday, September 19, 2012 7:33 PM
To: Bob Lund <b.lund@cablelabs.com>
Cc: "public-html-media@w3.org" <public-html-media@w3.org>
Subject: Re: [MSE] Updated MPEG-2 TS Byte Stream Requirements

Hi Bob,

Thanks for doing this. Sorry it has taken me a little while to respond. I've got a few questions and hopefully they won't seem too silly since I have no experience with MPEG-2 TS. I'll blame the MPEG-2 TS Wikipedia articles if I get something totally wrong. ;)

Many of these questions come from looking at the differences between the old proposal and your new version. My understanding of the old version mapped easily to my understanding of HLS. The new text signals to me that you are going for something more complex.

Here are the questions that came to mind:

1. Why is PSI included in both media segments and initialization segments? All these tables look like initialization segment only material.

PSI is required by MPEG-2 Systems to appear very frequently inband. So, every media segment longer than ~100ms will contain PSI.

[acolwell] So the problem here is that it would look like an initialization segment appeared in the middle of a media segment, because the PSI might appear in the middle of a GOP?

I think Alex has a deeper understanding here than I do, but I view the PSI as information that will be in an initialization segment but will also be in media segments. So, maybe in MPEG-2 TS there is no need for initialization segments, only self-initializing media segments?



2. Why was the 'transport_error_indicator set to 0' requirement removed?
 a. Why do we want to pass data we know is corrupt to the UA?
 b. Why can't the service the web application is talking to simply strip these bad packets?

The flag being set means that the stream is broken. This should be taken care of at the fragmentor, before the stream is ever segmented.


[acolwell] If the flag being set means that the stream is broken then we should restore the requirement that this always be set to 0. This makes it clear that the UA should signal an error if it encounters a TS packet with this bit set.
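
To make the check concrete, here is a minimal sketch of how a service (or a UA) might find TS packets with transport_error_indicator set. It assumes 188-byte packets aligned to the start of the buffer; the function name is made up for illustration and is not from any spec or library.

```python
from typing import List

TS_PACKET_SIZE = 188  # fixed MPEG-2 TS packet length in bytes
SYNC_BYTE = 0x47      # first byte of every TS packet


def find_corrupt_packets(data: bytes) -> List[int]:
    """Return the indices of packets whose transport_error_indicator is 1.

    Hypothetical helper: assumes `data` starts on a packet boundary and
    holds a whole number of 188-byte packets.
    """
    if len(data) % TS_PACKET_SIZE != 0:
        raise ValueError("buffer is not a whole number of TS packets")
    bad = []
    for index in range(len(data) // TS_PACKET_SIZE):
        packet = data[index * TS_PACKET_SIZE:(index + 1) * TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            raise ValueError("lost sync at packet %d" % index)
        # transport_error_indicator is the top bit of the second header byte.
        if packet[1] & 0x80:
            bad.append(index)
    return bad
```

A non-empty result is the case where, under the restored requirement, the UA would signal an error, or where the server could strip the packets before they are ever appended.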

3. How is the "self-initializing" media segment different from an initialization segment before a standard media segment? I don't really understand this distinction.

It’s the same, but “self-initializing” implies having PSI prior to media packets in the media segment.

[acolwell] I think in the Media Source context this doesn't need to be explicitly stated like it is in the DASH spec. In Media Source, the initialization segment must come before any media segments so these two scenarios don't look any different from the UA's point of view. The web application doesn't have to append full segments at a time so it is possible to append a self-initializing media segment in a way that looks like an init segment followed by a normal media segment. I don't think the text needs to treat these two situations separately.


4. How is the boundary between media segments detected? Does the Random Access indicator in the Adaptation Field mark the beginning of a new media segment?

No. Segment boundary is unmarked; the random access indicator indicates the beginning of an AU that has SPS and PPS in front of it in the AVC case. So in our case the first packet carrying media data will have this bit on, but the opposite is not necessarily true.

[acolwell] So this could be a problem if there is no straightforward way to identify where the beginning and end of media segments are. I think we need to come up with detailed rules for how to detect the beginning of a media segment as defined in the Media Source spec. The DASH definitions have the benefit of using files & byte ranges to delineate where segment boundaries are. These signals aren't present in Media Source since segments can be appended in pieces.
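
For reference, a rough sketch of reading the random access indicator from a single TS packet. As noted above, the bit alone does not mark a segment boundary; at best it is a necessary condition for the first media packet of a segment. The function name is illustrative only.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def has_random_access_indicator(packet: bytes) -> bool:
    """Return True if the packet's adaptation field sets random_access_indicator.

    Hypothetical helper for one 188-byte TS packet.
    """
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a single aligned TS packet")
    # adaptation_field_control: 2 = adaptation field only, 3 = field + payload.
    adaptation_field_control = (packet[3] >> 4) & 0x3
    if adaptation_field_control not in (2, 3):
        return False
    # A zero-length adaptation field carries no flags byte.
    if packet[4] == 0:
        return False
    # random_access_indicator is the second-highest bit of the flags byte.
    return bool(packet[5] & 0x40)
```

Any boundary-detection rule built on this would still need more signals, since the indicator can also be set on packets in the middle of a segment.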


5. What restrictions are put on the continuity counter & discontinuity indicator fields? It seems like these could confuse the UA if the web application tried to do "out of order" media segment appending. It seems to me that there should be no jumps in the counter or discontinuities inside a single media segment.

We put none. We can explicitly disallow random jumps, but explicitly signaled discontinuities can stay – they are perfectly legal, and will arise if two parts of the segment were “glued” together from different sources.
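
As an illustration of "disallow random jumps but keep signaled discontinuities", here is a sketch of a per-PID continuity counter check. It is simplified: a repeated counter value is accepted as a duplicate packet without verifying the packet contents, and packets without payload are skipped because their counter does not advance. The function name is an assumption, not existing API.

```python
from typing import Dict, List, Tuple

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
NULL_PID = 0x1FFF  # null packets carry no meaningful continuity counter


def find_unsignaled_jumps(data: bytes) -> List[Tuple[int, int]]:
    """Return (packet_index, pid) for counter jumps with no discontinuity_indicator."""
    last_counter: Dict[int, int] = {}
    jumps = []
    for index in range(len(data) // TS_PACKET_SIZE):
        packet = data[index * TS_PACKET_SIZE:(index + 1) * TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            raise ValueError("lost sync at packet %d" % index)
        pid = ((packet[1] & 0x1F) << 8) | packet[2]
        if pid == NULL_PID:
            continue
        control = (packet[3] >> 4) & 0x3
        counter = packet[3] & 0x0F
        if control not in (1, 3):
            continue  # no payload: the counter is not incremented
        # discontinuity_indicator is the top bit of the adaptation flags byte.
        signaled = control == 3 and packet[4] > 0 and bool(packet[5] & 0x80)
        previous = last_counter.get(pid)
        last_counter[pid] = counter
        if previous is None or signaled:
            continue  # first packet on this PID, or an explicitly signaled jump
        if counter == previous:
            continue  # treated as a duplicate packet (simplification)
        if counter != (previous + 1) & 0x0F:
            jumps.append((index, pid))
    return jumps
```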

[acolwell] So here is another case where I think being able to clearly identify where media segment boundaries are is important. I could maybe see this working out ok if the discontinuity happens within a single media segment, because we have a larger unit that we can use to resolve the relative jump. If we don't have a good definition for the boundaries then I think the discontinuity could get confused with an out-of-order media segment and the data would likely get placed in an unexpected part of the SourceBuffer. I think there is a disconnect here between the Media Source append model and MPEG2-TS's expected playout model. In Media Source, discontinuities in the timeline indicate out of order segment appending. In MPEG2-TS, discontinuities in the timeline are allowed, but don't imply out of order media. The expectation is that playback will continue forward, just with new PTS timestamp offsets applied to the new TS packets. This needs to be resolved so that the situation isn't ambiguous to the UA. Timestamp rollover is a similar situation.
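
To illustrate the rollover case: PTS values are 33 bits wide, so they wrap back to zero. The sketch below unwraps a raw PTS under the playout-style assumption that each timestamp lies close to the previous one. That assumption is exactly what Media Source cannot make, since a large backwards jump may be a legitimate out-of-order append. The function name is hypothetical.

```python
PTS_MODULUS = 1 << 33  # PTS is a 33-bit counter


def unwrap_pts(previous_unwrapped: int, raw_pts: int) -> int:
    """Return the value congruent to raw_pts (mod 2**33) nearest the previous PTS.

    Assumes forward playout: consecutive timestamps differ by less than
    half the PTS range. Not valid for out-of-order appends.
    """
    delta = (raw_pts - previous_unwrapped) % PTS_MODULUS
    if delta >= PTS_MODULUS // 2:
        delta -= PTS_MODULUS  # interpret as a small step backwards
    return previous_unwrapped + delta
```

With this rule a raw PTS of 50 following an unwrapped value of 2**33 - 100 is read as a rollover (2**33 + 50) rather than as a seek back to the start of the timeline, which is the ambiguity that needs a rule.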

6. Can you provide a little more background about why "The media segment will not rely on initialization information in another media segment." was added? I don't really understand. This implies to me that the bytestream would have to constantly alternate between initialization segments and media segments.

This implies that PSI appearing in the media segment describes the content of this segment entirely and completely. You don’t need to alternate, as the frequency of PSI repetition within the segment is fairly high.

[acolwell] Sorry. I don't think I was clear. I meant that the repeated PSI could simply be interpreted by the UA as the initialization segment being repeatedly interleaved with the media segments. The one situation where this wouldn't make sense is when the PSI appears in the middle of a GOP. In that case this data couldn't be considered an initialization segment, because that would mean that the media data that followed would have to be the start of a media segment, which isn't possible because it wouldn't start with a random access point. We just need to establish rules that allow the UA to be able to detect when the PSI is intended to be an init segment and when it is data that appears within a media segment.
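
One possible shape for such a rule, purely as a sketch and not something proposed in this thread: treat a run of PSI packets as an init segment candidate only if the next non-PSI packet carries the random access indicator. The sketch assumes the caller already knows the PSI PIDs (PID 0 for the PAT plus the PMT PID) and that a single elementary stream follows the PSI; all names are hypothetical.

```python
from typing import List, Set, Tuple

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def _random_access(packet: bytes) -> bool:
    """True if the adaptation field is present and sets random_access_indicator."""
    control = (packet[3] >> 4) & 0x3
    return control in (2, 3) and packet[4] > 0 and bool(packet[5] & 0x40)


def classify_psi_runs(data: bytes, psi_pids: Set[int]) -> List[Tuple[int, str]]:
    """Label each run of PSI packets by what immediately follows it.

    Returns (index_of_first_psi_packet, label), where label is
    "init-candidate" or "in-segment". A trailing PSI run with nothing
    after it is left unclassified.
    """
    results = []
    run_start = None
    for index in range(len(data) // TS_PACKET_SIZE):
        packet = data[index * TS_PACKET_SIZE:(index + 1) * TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            raise ValueError("lost sync at packet %d" % index)
        pid = ((packet[1] & 0x1F) << 8) | packet[2]
        if pid in psi_pids:
            if run_start is None:
                run_start = index
            continue
        if run_start is not None:
            label = "init-candidate" if _random_access(packet) else "in-segment"
            results.append((run_start, label))
            run_start = None
    return results
```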


7. What do you expect a deployment that uses MPEG-2 TS as you describe w/ MSE to look like? I'm trying to get a sense of how the web application will fetch the media and what type of infrastructure you expect it to be talking to.

One use case is MPEG DASH distribution using one of the two DASH MPEG-2 TS profiles. Another use case is HLS.

[acolwell] So in all these scenarios, the TS will be split into pieces and hosted on web servers?

8. What are you trying to provide with MPEG-2 TS that can't be accomplished with the ISO BMFF?

A considerable amount of video content in use today uses MPEG-2 TS.

[acolwell] Are we talking about web accessible content here? I know it is popular in broadcast equipment, but I haven't seen much content on the web aside from the occasional HLS content here and there. Can I get more examples please? I'm just trying to understand what we are talking about here.

  a. What are the tradeoffs?

Only supporting ISO BMFF means transmuxing this MPEG-2 TS content.

[acolwell] Why is that problematic? Isn't the TS content already being transformed in some way to make it conform to these requirements? It seems like at a minimum you'd want to remove all the redundant PSI since the content is being transported over a lossless channel.

  b. Why is it beneficial for UAs to take on the burden of supporting another format?

Some UAs support MPEG-2 TS. Some UAs will be required to support MPEG-2 TS. Defining the MPEG-2 TS byte stream requirements doesn't require a UA to support that format, does it?

[acolwell] Ok. I'm trying to understand whether these UAs will be typical browsers or special cable operator specific applications. If it is the latter it might be better to have this specified in a cable spec. I'm trying to understand what the minimal set of MPEG-2 TS is that makes sense in a web context. If there are further requirements needed by operators, that is fine, but it isn't clear to me that the place for that is in a web spec.

You are correct that defining it doesn't mean UAs have to support it since the plan was to put this in a non-normative section like ISO BMFF & WebM are.


I'd also like to request that the Encrypted Media Segments section be removed and the discussion of that part be moved to the EME spec work. I understand it is important, but I'd like to keep encryption topics outside of MSE. Your definitions for initialization segments appear to be broad enough to include the necessary information just like the WebM & ISO BMFF sections are.

OK.

[acolwell] Thank you.

I appreciate your patience with all my questions. I'm really just trying to understand the scope and complexity of what is being proposed and why it is important to people.

Aaron
Received on Monday, 24 September 2012 20:14:32 UTC