Re: TextTrack questions

On Thu, Jan 16, 2014 at 5:44 AM, Silvia Pfeiffer
<silviapfeiffer1@gmail.com> wrote:
>
> Hi Aaron,
>
> On Thu, Jan 16, 2014 at 9:02 AM, Aaron Colwell <acolwell@google.com> wrote:
> > Hi,
> >
> > I have a few questions about TextTrack behavior for modified cues and
> > in-band tracks. Please forgive me if some of these answers are in the spec
> > and I didn't happen to find them. I did look. :)
> >
> > 1. If an attribute on a TextTrackCue like id or startTime is modified, is
> > this cue supposed to be internally identified in the UA as "the cue formerly
> > known as Y in the original source file"?
>
> Yes, that's how I'm reading it.
>
> > In the case of out-of-band text tracks this seems like a non-issue because
> > the cues are inserted into the TextTrackList only at the beginning of
> > playback when they are parsed. The parser visits the source data once and
> > inserts cues only once.
> >
> > In the case of in-band tracks it is not clear what should be done because
> > seeking back to an earlier portion of the clip would cause the file format
> > parser to parse the cue data again and potentially try to insert it again.
> > If the original cue was modified, then the cue data in the file and the
> > values in the existing TextTrackCue wouldn't match so simple duplicate
> > detection would not work.
>
> You're correct, the case of seeking is not covered in
> http://www.w3.org/html/wg/drafts/html/master/embedded-content-0.html#sourcing-in-band-text-tracks
> . It simply says "Populate the new text track's list of cues with the
> cues parsed so far, following the guidelines for exposing cues, and
> begin updating it dynamically as necessary."
>
>
> > Is the UA expected to keep a unique internal id for each in-band
> > TextTrackCue it creates so that it doesn't reinsert a cue that was inserted
> > previously, but was modified?
>
> I think we have to consider what the expectation of the user (and Web
> dev in this case) is and I think that if a cue was pulled in from an
> in-band track and then changed, that change needs to be retained from
> a user POV. It's possible we could expect the Web dev to keep track of
> this and have to reapply the change when seeking happens. But if we
> want to make it easy for the Web dev, we do it in the browser.
>
> I think that would have two consequences:
> * a changed cue has to remain in the browser's cache and can't be
> dropped to free space for loading other data
> * some means of identifying the original cue has to be retained
>
> What you suggested with a unique internal id makes sense to me to
> resolve the second issue.
>
> > I couldn't find anything in the HTML5 spec that clarified this situation.
>
> It seems we will need to add something to this effect unless we expect
> the Web dev to keep track of this.

I think the way this ought to work for vanilla <video> playback is
that a cue is only ever sourced once. The text track implementation
need not have any concept of changed/unchanged cues for this; it
should suffice to have the demuxer keep a set of byte ranges for which
it has already sourced cues.
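
To make the byte-range idea concrete, here is a rough sketch of the
bookkeeping a demuxer could do. It is only an illustration: the names
(SourcedRanges, markSourced, isSourced, shouldSourceCues) are made up,
and a real demuxer would work on its own notion of blocks or packets.

```typescript
// Hypothetical sketch, not taken from any spec or browser implementation.
interface ByteRange { start: number; end: number } // half-open [start, end)

class SourcedRanges {
  private ranges: ByteRange[] = []; // kept sorted and non-overlapping

  // True if [start, end) lies entirely inside an already-sourced range.
  isSourced(start: number, end: number): boolean {
    return this.ranges.some(r => r.start <= start && end <= r.end);
  }

  // Record [start, end) as sourced, merging overlapping or adjacent ranges.
  markSourced(start: number, end: number): void {
    const merged: ByteRange = { start, end };
    const kept: ByteRange[] = [];
    for (const r of this.ranges) {
      if (r.end < merged.start || r.start > merged.end) {
        kept.push(r); // disjoint and not adjacent: keep as-is
      } else {
        merged.start = Math.min(merged.start, r.start);
        merged.end = Math.max(merged.end, r.end);
      }
    }
    kept.push(merged);
    kept.sort((a, b) => a.start - b.start);
    this.ranges = kept;
  }
}

// Returns true only the first time a given byte range is visited, so the
// caller emits cues from that range once, even after seeking back.
function shouldSourceCues(seen: SourcedRanges, start: number, end: number): boolean {
  if (seen.isSourced(start, end)) return false;
  seen.markSourced(start, end);
  return true;
}
```

With this, seeking back over an already-parsed range yields no new
cues, regardless of whether a script has modified the existing ones.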

> > 2. When using the Media Source Extensions, it is possible to append data
> > over the top of previously appended data to replace it. If such an append
> > occurs that triggers the removal of a cue from the SourceBuffer then it's
> > pretty clear that the TextTrackCue can be removed from the TextTrackList if
> > it hasn't been modified. If the TextTrackCue has been modified though,
> > should the UA remove it because the underlying cue in the SourceBuffer was
> > removed, or should it leave the modified cue in the TextTrackList?
> >
> > My assumption is that unmodified cues should be removed, but modified ones
> > should stay, but I just wanted to double check this assumption.
>
> Again, I wonder what the user's expectation would be here. Does the
> replacing of previously appended data only refer to audio and video
> data or also to text track data? If so, I think removing the modified
> cue on the text track makes sense. If, however, you're just exchanging
> the audio and video blocks, the cues should be retained.

Hmm, the MSE case is a bit less obvious. It would be nice to have all
cues behave the same way and not have a changed/unchanged concept,
since that requires either a state bit hidden from scripts or lots of
comparisons to determine equality. How about this: every time the
buffered ranges change in MSE, all cues of in-band tracks that do not
overlap with the new buffered ranges are thrown out? (Out-of-band and
script-created tracks would best be left alone I think.)
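
As a rough sketch of that rule (hypothetical names, plain objects
standing in for TextTrackCue and TimeRanges, and assuming "overlap"
means the cue interval and a buffered range share some time):

```typescript
// Hypothetical sketch: CueLike and TimeRange are stand-ins, not DOM types.
interface TimeRange { start: number; end: number }
interface CueLike { startTime: number; endTime: number }

// True if the cue's [startTime, endTime) intersects any buffered range.
function overlapsAny(cue: CueLike, buffered: TimeRange[]): boolean {
  return buffered.some(r => cue.startTime < r.end && cue.endTime > r.start);
}

// Returns the in-band cues that survive a change of the buffered ranges.
// Modified and unmodified cues are treated identically.
function pruneInBandCues<T extends CueLike>(cues: T[], buffered: TimeRange[]): T[] {
  return cues.filter(c => overlapsAny(c, buffered));
}
```

Only in-band cues would be passed to pruneInBandCues; out-of-band and
script-created tracks never go through it.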

> > 3. If we have an append that removes a cue and then another one that adds
> > back the cue, then the proper behavior would be to resolve the first append
> > according to the answer to #2 and then insert a new TextTrackCue for the
> > second append since there is no way to know for sure whether it was related
> > to the original cue. Does this sound reasonable?
>
> Would that not possibly result in doubly appending the same cue, even
> if the first one was changed?
>
> I wonder: since the change to the original TextTrackCue was done by
> the Web dev, maybe we can just leave the keeping track of the changes
> to the Web dev. With these consequences:
> * when seeking on an in-band track, the Web dev has to expect changed
> cues to be removed and replaced back with the original cues
> * when using MSE and previously appended data is replaced, the Web dev
> has to expect changed cues to be removed and replaced with those from
> the newly appended data
>
> I'm not sure as yet whether we should help the Web dev or not. But in
> either case, we need to clarify the spec.

The consequence of what I suggested for (1) and (2) above would be
that a modified cue can be thrown away and then be re-sourced in its
original form. This is easy to implement, and I imagine it wouldn't be
much of a problem for authors in sane cases, and could be worked around
with duplicate detection by the script when needed. After all, with
MSE the script can know whether it has pushed a buffer before and thus
whether or not its cues have been sourced, while the browser cannot.
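
For what it's worth, script-side duplicate detection could look
something like the sketch below. Everything here is hypothetical: the
ModifiedCueRegistry class, the cueKey helper, and the choice of
(startTime, endTime, text) as the identity of an original cue are all
assumptions, and how the script learns that a cue was added is left
out.

```typescript
// Hypothetical sketch: TextCue is a stand-in for TextTrackCue.
interface TextCue { startTime: number; endTime: number; text: string }

// Identity of a cue as originally sourced (an assumption, see above).
const cueKey = (c: TextCue): string => `${c.startTime}|${c.endTime}|${c.text}`;

class ModifiedCueRegistry {
  private edits = new Map<string, (c: TextCue) => void>();

  // Apply an edit and remember it under the cue's original identity.
  modify(cue: TextCue, edit: (c: TextCue) => void): void {
    const key = cueKey(cue); // computed before the edit changes the cue
    this.edits.set(key, edit);
    edit(cue);
  }

  // Call for each newly sourced cue. If it matches a cue that was edited
  // before, reapply the edit and report that it was a re-sourced duplicate.
  onCueAdded(cue: TextCue): boolean {
    const edit = this.edits.get(cueKey(cue));
    if (!edit) return false;
    edit(cue);
    return true;
  }
}
```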

Philip

Received on Friday, 17 January 2014 02:17:44 UTC