- From: Philip Jägenstedt <philipj@opera.com>
- Date: Fri, 17 Jan 2014 09:17:15 +0700
- To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Cc: Aaron Colwell <acolwell@google.com>, public-html <public-html@w3.org>
On Thu, Jan 16, 2014 at 5:44 AM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
>
> Hi Aaron,
>
> On Thu, Jan 16, 2014 at 9:02 AM, Aaron Colwell <acolwell@google.com> wrote:
> > Hi,
> >
> > I have a few questions about TextTrack behavior for modified cues and
> > in-band tracks. Please forgive me if some of these answers are in the spec
> > and I didn't happen to find them. I did look. :)
> >
> > 1. If an attribute on a TextTrackCue like id or startTime is modified, is
> > this cue supposed to be internally identified in the UA as "the cue formerly
> > known as Y in the original source file"?
>
> Yes, that's how I'm reading it.
>
> > In the case of out-of-band text tracks this seems like a non-issue because
> > the cues are inserted into the TextTrackList only at the beginning of
> > playback when they are parsed. The parser visits the source data once and
> > inserts cues only once.
> >
> > In the case of in-band tracks it is not clear what should be done because
> > seeking back to an earlier portion of the clip would cause the file format
> > parser to parse the cue data again and potentially try to insert it again.
> > If the original cue was modified, then the cue data in the file and the
> > values in the existing TextTrackCue wouldn't match so simple duplicate
> > detection would not work.
>
> You're correct, the case of seeking is not covered in
> http://www.w3.org/html/wg/drafts/html/master/embedded-content-0.html#sourcing-in-band-text-tracks
> . It simply says "Populate the new text track's list of cues with the
> cues parsed so far, following the guidelines for exposing cues, and
> begin updating it dynamically as necessary."
>
> > Is the UA expected to keep a unique internal id for each in-band
> > TextTrackCue it creates so that it doesn't reinsert a cue that was inserted
> > previously, but was modified?
>
> I think we have to consider what the expectation of the user (and Web
> dev in this case) is and I think that if a cue was pulled in from an
> in-band track and then changed, that change needs to be retained from
> a user POV. It's possible we could expect the Web dev to keep track of
> this and have to reapply the change when seeking happens. But if we
> want to make it easy for the Web dev, we do it in the browser.
>
> I think that would have two consequences:
> * a changed cue has to remain in the browser's cache and can't be
> dropped to free space for loading other data
> * some means of identifying the original cue has to be retained
>
> What you suggested with a unique internal id makes sense to me to
> resolve the second issue.
>
> > I couldn't find anything in the HTML5 spec that clarified this situation.
>
> It seems we will need to add something to this effect unless we expect
> the Web dev to keep track of this.

I think the way this ought to work for vanilla <video> playback is that a cue is only ever sourced once. However, the text track implementation need not have any concept of changed/unchanged cues; it should suffice to have the demuxer keep a set of byte ranges for which it has already sourced cues.

> > 2. When using the Media Source Extensions, it is possible to append data
> > over the top of previously appended data to replace it. If such an append
> > occurs that triggers the removal of a cue from the SourceBuffer then it's
> > pretty clear that the TextTrackCue can be removed from the TextTrackList if
> > it hasn't been modified. If the TextTrackCue has been modified though,
> > should the UA remove it because the underlying cue in the SourceBuffer was
> > removed, or should it leave the modified cue in the TextTrackList?
> >
> > My assumption is that unmodified cues should be removed, but modified ones
> > should stay, but I just wanted to double check this assumption.
>
> Again, I wonder what the user's expectation would be here.
> Does the replacing of previously appended data only refer to audio and
> video data or also to text track data? If so, I think removing the modified
> cue on the text track makes sense. If, however, you're just exchanging
> the audio and video blocks, the cues should be retained.

Hmm, the MSE case is a bit less obvious. It would be nice to have all cues behave the same way and not have a changed/unchanged concept, since that requires either a state bit hidden from scripts, or lots of comparisons to determine equality. How about every time the buffered ranges change in MSE, all cues for in-band tracks which do not overlap with the new buffered ranges are thrown out? (Out-of-band and script-created tracks would best be left alone I think.)

> > 3. If we have an append that removes a cue and then another one that adds
> > back the cue, then the proper behavior would be to resolve the first append
> > according to the answer to #2 and then insert a new TextTrackCue for the
> > second append since there is no way to know for sure whether it was related
> > to the original cue. Does this sound reasonable?
>
> Would that not possibly result in doubly appending the same cue, even
> if the first one was changed?
>
> I wonder: since the change to the original TextTrackCue was done by
> the Web dev, maybe we can just leave the keeping track of the changes
> to the Web dev. With these consequences:
> * when seeking on an in-band track, the Web dev has to expect changed
> cues to be removed and replaced back with the original cues
> * when using MSE and previously appended data is replaced, the Web dev
> has to expect changed cues to be removed and replaced with those from
> the newly appended data
>
> I'm not sure as yet whether we should help the Web dev or not. But in
> either case, we need to clarify the spec.

The consequence of what I suggested for (1) and (2) above would be that a modified cue can be thrown away and then be re-sourced in its original form.
This is easy to implement, and I imagine it wouldn't be much of a problem to authors in sane cases, and could be worked around with duplicate detection by the script when needed. After all, with MSE the script can know whether it has pushed a buffer before and thus whether or not its cues have been sourced, while the browser cannot.

Philip
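
For illustration only, the two suggestions above (a set of already-sourced byte ranges for plain <video> playback, and discarding in-band cues that fall outside the buffered ranges for MSE) might be sketched roughly as follows. This is not taken from any spec or browser implementation: Python is used purely for brevity, and every name here (RangeSet, Cue, source_cues, prune_cues) is hypothetical. It also assumes each block of cue data occupies one contiguous byte range, and it ignores partially overlapping blocks.

```python
from dataclasses import dataclass
from typing import List, Tuple

Range = Tuple[float, float]  # half-open interval [start, end)


class RangeSet:
    """Sorted, non-overlapping ranges (hypothetical helper, not a real API)."""

    def __init__(self) -> None:
        self._ranges: List[Range] = []

    def add(self, start: float, end: float) -> None:
        """Insert [start, end), merging with ranges it overlaps or touches."""
        kept: List[Range] = []
        for s, e in self._ranges:
            if e < start or s > end:
                kept.append((s, e))  # disjoint from the new range
            else:
                start, end = min(s, start), max(e, end)  # absorb it
        kept.append((start, end))
        kept.sort()
        self._ranges = kept

    def covers(self, start: float, end: float) -> bool:
        """True if [start, end) lies entirely inside one stored range."""
        return any(s <= start and end <= e for s, e in self._ranges)


@dataclass
class Cue:
    start_time: float
    end_time: float
    text: str


def source_cues(sourced: RangeSet, track: List[Cue],
                byte_start: int, byte_end: int, parsed: List[Cue]) -> None:
    """Idea (1): add cues from a byte range only the first time it is parsed."""
    if sourced.covers(byte_start, byte_end):
        return  # already sourced; a script-modified cue is left untouched
    track.extend(parsed)
    sourced.add(byte_start, byte_end)


def prune_cues(cues: List[Cue], buffered: List[Range]) -> List[Cue]:
    """Idea (2): keep only in-band cues overlapping some buffered time range."""
    return [c for c in cues
            if any(c.start_time < e and s < c.end_time for s, e in buffered)]
```

Neither function compares cue contents or tracks a modified flag. In this sketch a cue edited by script survives a seek back over its byte range, but it is dropped like any other cue once its time span leaves the buffered ranges, after which it could be re-sourced in its original form.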
Received on Friday, 17 January 2014 02:17:44 UTC