Re: Resolving TextTrackCue issues

On 9/5/13 8:22 AM, "Silvia Pfeiffer" <> wrote:

>On Thu, Sep 5, 2013 at 1:52 AM, Bob Lund <> wrote:
>> On 9/4/13 12:48 AM, "Silvia Pfeiffer" <> wrote:
>>>On Wed, Sep 4, 2013 at 2:14 PM, Eric Carlson <>
>>>> On Sep 3, 2013, at 5:16 PM, Silvia Pfeiffer
>>>>> On Wed, Sep 4, 2013 at 9:19 AM, Eric Carlson <>
>>>>>>> Do you expose the existence of these in-band captions somehow to
>>>>>>> I'm concerned that if the browser renders captions automatically on
>>>>>>> top of video without the JS developer being able to find out about
>>>>>>> how would the JS developer know that there are captions and to
>>>>>>> rendering another lot themselves - or rendering something else in
>>>>>>> space of the captions?
>>>>>>  On versions of the OS where it is possible for WebKit to "take
>>>>>> rendering of in-band captions from the media engine, they behave
>>>>>> out-of-band tracks: in-band tracks are part of the video.textTracks
>>>>>>and the
>>>>>> cues are part of track.cues/activeCues (when appropriate).
>>>>> So for the JS dev they are exposed as instances of the old
>>>>> TextTrackCue interface?
>>>>   Yes.
>>>>> The VTTCue interface sufficiently satisfies this use case then?
>>>>   True, but the in-band cues may or may not be WebVTT originally. Is
>>>>there an advantage to using VTTCue instead of the old TextTrackCue
>>>getCueAsHTML() of VTTCue interprets what's in .text as WebVTT content.
>>>Exposing other format content in that way would not make much sense,
>>>just as parsing HTML through a Word Doc format parser does not make
>>>much sense. So, IMHO neither VTTCue nor the old TextTrackCue interface is
>>>appropriate here since you're not dealing with WebVTT content
>>>>> What is the content in .text ?
>>>>   The cue text.
>>>Just to clarify: Is that a plain text version of the original cue
>>>content? For example, for CEA608 content from a SCC file it would just
>>>be the plain text of the cue stripped of all the other characters? Or
>>>in the case of SRT, all tags are stripped and just the plain text is
>>>If you do that, then we're just pretending the in-band text track was
>>>actually WebVTT content.
>>>If you don't return plain text, but the actual original content of the
>>>cue, you get the wrong behaviour with getCueAsHTML() and the rendering
>>>algorithm of the old TextTrackCue or the new VTTCue interface.
>>>>>>  On versions of the OS where the system frameworks do not have the
>>>>>> necessary API to override cue rendering, in-band tracks are part of
>>>>>> video.tracks so they can be enabled/disabled by script but cues are
>>>>>> by the media engine.
>>>>> In this case, I assume only the existence of the track, but not of
>>>>> cues is exposed to JS? I.e. track.cues/activeCues is empty? Or are
>>>>> listing fully-abstract TextTrackCue instances here to at least
>>>>> starttime/endtime to the JS devs?
>>>>   Correct, the cues are not available to WebKit at all in this case.
>>>OK. Would you consider using the abstract TextTrackCue interface of
>>>the WHATWG spec for exposing these cues, so JS developers can at least
>>>react to cue change events?
>>>>> And an orthogonal question: you've probably seen the Cablelabs spec
>>>>> for exposing MPEG-2 in-band text tracks of different types to
>>>>> Seeing as both Cablelabs and the HTML spec are trying to accommodate
>>>>> using in-band text tracks in HTML/JS, what do you suggest is the best
>>>>> way forward to specify this?
>>>>> * Program Map Table: would you suggest to use VTTCue with
>>>>> @kind=metadata, GenericCue, or a new PMTCue interface?
>>>>   It seems to me that text track content that a UA does not render
>>>>itself is, by definition, metadata.
>>>In this case, that definition works, because no @kind value matches
>>>the semantic content of a PMT track.
>>>But caption data that is not rendered is still semantically of
>>>> Again, it is not WebVTT data so is there an advantage to VTTCue versus
>>>>old TextTrackCue interface?
>>>Not between those two, since they are identical.
>>>>> * CEA708 track: assuming we don't want to introduce CEA708Cue, how
>>>>> would that best be supported?
>>>>   Are any browsers planning to support 708 captions natively?
>>>What does "support" mean?
>>>Parse them out of an MPEG-2 file (like the Cablelabs spec suggests) and
>>>expose them to JS?
>>>Or .. render them like you do for formats that the QuickTime
>>>framework already supports, but without exposing the original data and
>>>pretending it's WebVTT?
>>>Or .. go all the way to exposing the format with its own features?
>>>Only the 3rd option requires specification of a new interface. The
>>>first option requires something like the GenericCue interface, since
>>>we need to give the original content of the cue to JS to parse.
>>>It seems that Cablelabs expected that there would be browsers that
>>>implement parsing, but not rendering for text tracks in MPEG-2. It
>>>would be good if there was a statement from browser vendors (or
>>>set-top box developers that use browser rendering engines or the like) if
>>>that is a realistic use case.
>> We've implemented UA rendering of 708 captions - this is done by
>> converting 708 to WebVTT and then taking advantage of the existing
>> rendering.
>Out of curiosity: In the .text attribute of VTTCue / the old
>TextTrackCue interface, are you exposing the 708 cue content or the
>converted WebVTT cue content?
>> The spec calls for making the caption data from the MPEG-2 TS
>> available to JS via .text.
>Right. Is that on a track with @kind=captions or @kind=metadata?

In the case where the UA can render the CC, @kind=captions, else

>Thanks for your input!
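
For illustration only, here is a rough TypeScript sketch of how a page script might tell apart the two cases discussed in this thread: an in-band track whose cues are exposed through track.cues, and a track that is listed but whose cues stay inside the media engine. The names CueLike, TrackLike, classifyTrack and summarize are hypothetical and come from no spec, and treating an empty or null cue list as "track-only" is an assumption (an empty list could also mean the cues have simply not arrived yet).

```typescript
// Minimal structural stand-ins for the DOM text track types, so the sketch
// can run outside a browser. These are hypothetical helper types.
interface CueLike {
  startTime: number;
  endTime: number;
}

interface TrackLike {
  kind: string;
  label: string;
  cues: ArrayLike<CueLike> | null;
}

type Exposure = "cues-exposed" | "track-only";

// Assumption: a track with a null or empty cue list is one whose cues are
// rendered by the media engine and not exposed to script.
function classifyTrack(track: TrackLike): Exposure {
  return track.cues !== null && track.cues.length > 0
    ? "cues-exposed"
    : "track-only";
}

// Produce one "kind:exposure" entry per track, in list order.
function summarize(tracks: ArrayLike<TrackLike>): string[] {
  const out: string[] = [];
  for (let i = 0; i < tracks.length; i++) {
    const t = tracks[i];
    out.push(`${t.kind}:${classifyTrack(t)}`);
  }
  return out;
}
```

In a browser, something like summarize(video.textTracks) would presumably work, since the stand-in types only ask for kind, label and cues; a script could then avoid drawing its own captions over a "track-only" captions track.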

Received on Thursday, 5 September 2013 19:23:47 UTC