[media-source] Consider refining "3.5.8 Coded Frame Processing" Step 1.14 part 2 for multiple frame per MediaSample (#269)

aToshioOgasawara has just created a new issue for https://github.com/w3c/media-source:

== Consider refining "3.5.8 Coded Frame Processing" Step 1.14 part 2 for multiple frame per MediaSample ==
I'm encountering a MediaSample leak issue on WebKit, which was filed as a WebKit Bug ticket [Bz222448](https://bugs.webkit.org/show_bug.cgi?id=222448).

When overlapping MediaSamples each contain multiple frames in DTS order and either Condition A) or Condition B) below holds, the overlapped MediaSamples are not deleted; such MediaSamples are leaked.

* Condition A): A MediaSample's PTS + its duration exceeds the next MediaSample's PTS
* Condition B): The I-Frame start position of a newly appended MediaSample doesn't match the I-Frame start position of a MediaSample which is already buffered.

I think the issue comes from a mismatch: a MediaSample can hold multiple frames, but the eviction process seemingly expects each MediaSample to correspond to a single frame. To make the SourceBuffer eviction algorithm more general, the eviction process should check the end of the MediaSample instead of its presentation timestamp (PTS).

The patch attached to Bz222448 changes the wording of "3.5.8 Coded Frame Processing" Step 1.14 part 2.

* Current: If highest presentation timestamp for track buffer is set and less than or equal to presentation timestamp
* Updated: If highest presentation timestamp for track buffer is set and less than frame end timestamp

The change is that "or equal to presentation" is replaced with "frame end".
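
To illustrate how the two wordings differ, here is a minimal sketch of only the condition being tested. It is not the WebKit patch or the spec algorithm; the function names and the example timestamps are hypothetical, and `None` stands for "highest presentation timestamp is not set".

```python
from typing import Optional


def current_condition(highest_pts: Optional[float], pts: float) -> bool:
    """Current wording: highest presentation timestamp is set and
    less than or equal to the new frame's presentation timestamp."""
    return highest_pts is not None and highest_pts <= pts


def proposed_condition(highest_pts: Optional[float], frame_end: float) -> bool:
    """Updated wording: highest presentation timestamp is set and
    less than the new frame's frame end timestamp."""
    return highest_pts is not None and highest_pts < frame_end


# Hypothetical values: a multi-frame MediaSample that starts before the
# track buffer's highest presentation timestamp but ends after it.
highest = 10.0
pts, duration = 8.0, 4.0
frame_end = pts + duration  # 12.0

print(current_condition(highest, pts))         # False: 10.0 <= 8.0 does not hold
print(proposed_condition(highest, frame_end))  # True: 10.0 < 12.0 holds
```

With these assumed values the current wording skips this removal branch, while the updated wording enters it.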

I received the following advice in Bz222448 from an Apple engineer:

>It looks like we do correctly set "highestPresentationTimestamp" to the "frame end timestamp" later in step 1.19, but I do wonder if there were other changes to the specification around "frame end timestamp" that may have been missed. So your proposed change would be a willful departure from the text of the specification. Should this be brought up to the MSE spec authors first? Or is there another way to solve the issue you're attempting to fix?

I think the description change is necessary for generalizing "3.5.8 Coded Frame Processing", though other relevant parts of the specification might be affected by it. I'd like to discuss whether this change is feasible and how to keep it compatible with the related MSE specifications.

Reference:
- [Bz222448](https://bugs.webkit.org/show_bug.cgi?id=222448) [MSE] Overlapping MediaSamples are not deleted
- ["3.5.8 Coded Frame Processing" Step 1.14 part2](https://www.w3.org/TR/media-source/#sourcebuffer-coded-frame-processing)

Please view or discuss this issue at https://github.com/w3c/media-source/issues/269 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Thursday, 18 March 2021 06:12:57 UTC