Re: TextTrack questions

On Thu, Jan 23, 2014 at 1:13 AM, Aaron Colwell <acolwell@google.com> wrote:
> Comments inline...
>
>
> On Tue, Jan 21, 2014 at 7:06 PM, Philip Jägenstedt <philipj@opera.com>
> wrote:
>>
>> On Wed, Jan 22, 2014 at 3:31 AM, Aaron Colwell <acolwell@google.com>
>> wrote:
>> >
>> > On Tue, Jan 21, 2014 at 8:05 AM, Eric Carlson <eric.carlson@apple.com>
>> > wrote:
>> >>
>> >>
>> >> On Jan 17, 2014, at 6:38 PM, Philip Jägenstedt <philipj@opera.com>
>> >> wrote:
>> >>
> >> >> > I hadn't thought about those cases either. It would be nice (simple)
>> >> > to
>> >> > just say that for all in-band tracks, when the buffered ranges are
>> >> > updated, evict all cues which now do not overlap the buffered ranges.
>> >> > Is that going to break something?
>> >> >
>> >>
> >> >>   That does seem like the simplest (to implement and understand)
>> >> solution
>> >> to this issue.
>> >>
>> >>   I don’t know of anything that it will break.
>> >
>> >
>> > I have a few questions about this:
>> >
>> > 1. Does "evict all cues" include ones added by JavaScript or just the
>> > ones
>> > added by the inband track?
>>
>> All cues. If you don't want to lose them, put them in a script-created
> >> track or keep them around to re-insert later when that region becomes
>> buffered again.
>
>
> Agreed.
>
>>
>>
>> > 2. For out-of-band tracks is the application expected to remove cues
>> > from
>> > unbuffered regions? I'm mainly thinking about a live broadcast where
>> > cues
>> > are inserted by JavaScript instead of in-band.
>>
>> I don't think we should throw away cues for out-of-band or
>> script-created tracks, no. The application will have to throw them
>> away if it's an infinite stream.
>
>
> Makes sense to me. I just wanted to double check.
>
>>
>>
>> > 3. I'm assuming "when the buffered ranges are updated" is evaluated on a
>> > per
>> > track basis and not actually what the HTMLMediaElement exposes since
>> > removal
>> > of text track data does not necessarily imply removal of audio/video
>> > data.
>> > Am I understanding the intended meaning correctly?
>>
>> I did mean HTMLMediaElement.buffered. AudioTrack/VideoTrack/TextTrack
>> don't expose buffered ranges, so I'm not sure what the alternative
>> would be. Do you mean something with SourceBuffer.buffered or some
>> spec-internal concept I'm forgetting?
>
>
> The changes I'm talking about would not necessarily be script visible
> because they are replacing existing buffered data in the presentation. This
> would likely be done atomically from JavaScript's perspective so script
> would have no opportunity to observe a hole in buffered. Also because text
> tracks are discontinuous they don't affect buffered ranges in the same way
> that audio or video would. For example if I have a time region that contains
> audio & a few cues and I overlap that with a media segment that only
> contains audio, the cues should be removed, but no hole in the buffered
> ranges would appear because the new data completely replaces the old data.
>
> Because text tracks are discontinuous, they can't really affect the buffered
> ranges in the way that audio & video do. If we did a simple intersection of
> all buffered ranges across tracks the result would only show buffered data
> where cues are. That is clearly wrong. In Chromium's MSE implementation we
> treat inband text track buffered ranges as a single range between
> [0-duration) so that simple intersection with the audio & video tracks
> buffered ranges will show the actual buffered ranges that are playable.
> Given this, any changes to an in-band text track won't affect the buffered
> ranges even though the actual cues in the presentation have changed.
>
>>
>>
>> > 4. MSE overlapping appends should be treated as a time range becoming
>> > "unbuffered", so old cues are removed, and then "buffered" with the new
>> > cues
>> > being appended. Right?
>>
>> So, assuming that an overlapping append causes
> >> HTMLMediaElement.buffered to be modified at a time when scripts can
>> observe it, yes, there would be an opportunity to evict cues.
>> Basically, I think the HTML spec should have a hook that is run
>> anytime that a script accessing .buffered would see something
>> different than it did before that point. I haven't checked if that's
>> implementable in an efficient way though, if one calculates the
>> buffered ranges lazily on request then we have a problem...
>
>
> As I've outlined above, I don't think we can simply restrict this to
> observable changes in buffered. I do think it would be handy to have a hook
> in the HTML spec that can be invoked when a buffered time range has been
> removed from the presentation. That would at least allow there to be a step
> in the MSE spec which triggers the removal of old cues when an overlapping
> append occurs. There could also be language in there about the buffered
> attribute being updated, but the MSE spec overloads the default buffered
> behavior so it wouldn't apply to MSE.
>
> I hope this helps. Thanks for helping me figure this stuff out. :)

It helped indeed; I didn't remember that MSE could throw away data by
an overlapping append so that the hole is never visible to scripts. A
hook to evict cues in the HTML spec indeed sounds like what's needed.
Since it cannot look at HTMLMediaElement.buffered, I think giving it a
single range within which it should evict all completely contained
cues would work. Example:

buffered was: [0, 10), [30, 50)

MSE does an overlapping append of [20, 40). Intersect that with
buffered to get the dropped range [30, 40). Now expand that range to
cover the adjacent hole in buffered, [10, 30), to get [10, 40). This is
the range you would pass to the hook, which should then evict all cues
completely contained within it.
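
To make the steps concrete, here is a rough sketch of the range
computation and the eviction. It is only an illustration, not proposed
spec text: the names eviction_range and evict_cues are made up, ranges
and cues are plain (start, end) pairs, buffered is assumed to be sorted
and non-overlapping, and merging several dropped pieces into one span
is an assumption the example above does not cover.

```python
from typing import List, Optional, Tuple

Range = Tuple[float, float]  # half-open [start, end), in seconds


def eviction_range(buffered: List[Range], appended: Range,
                   duration: float = float("inf")) -> Optional[Range]:
    """Return the single range to hand to the (hypothetical) hook, or None."""
    a_start, a_end = appended
    # Step 1: intersect the appended range with each buffered range.
    dropped = [(max(s, a_start), min(e, a_end))
               for s, e in buffered if max(s, a_start) < min(e, a_end)]
    if not dropped:
        return None  # nothing was replaced, so nothing to evict
    # Assumption: several dropped pieces are treated as one span.
    d_start, d_end = dropped[0][0], dropped[-1][1]
    start, end = d_start, d_end
    # Step 2: grow each edge across the hole next to it, if there is one.
    for i, (s, e) in enumerate(buffered):
        if s == d_start:  # a hole (or time 0) lies just before the span
            start = buffered[i - 1][1] if i > 0 else 0.0
        if e == d_end:    # a hole (or the end of the stream) lies just after
            end = buffered[i + 1][0] if i + 1 < len(buffered) else duration
    return (start, end)


def evict_cues(cues: List[Range], rng: Range) -> List[Range]:
    """Keep only the cues that are not completely contained in rng."""
    lo, hi = rng
    return [c for c in cues if not (lo <= c[0] and c[1] <= hi)]


rng = eviction_range([(0, 10), (30, 50)], (20, 40))
print(rng)
print(evict_cues([(5, 8), (12, 15), (38, 42), (45, 47)], rng))
```

With the numbers from the example, the computed range is (10, 40). A cue
at (12, 15) is evicted, while a cue at (38, 42) survives because it
extends past the end of the range.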

Would this work?

Philip

P.S. I found something odd while looking at TextTrack in MSE:
https://www.w3.org/Bugs/Public/show_bug.cgi?id=24370

Received on Thursday, 23 January 2014 03:52:19 UTC