Re: Resolving TextTrackCue issues from Eric Carlson on 2013-09-05 (public-html@w3.org from September 2013)

From: Eric Carlson <eric.carlson@apple.com>
Date: Thu, 05 Sep 2013 09:08:15 -0700
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Cc: Jer Noble <jer.noble@apple.com>, Glenn Adams <glenn@skynav.com>, Philip Jägenstedt <philipj@opera.com>, public-html <public-html@w3.org>, Ian Hickson <ian@hixie.ch>, Bob Lund <B.Lund@cablelabs.com>
Message-id: <1C4AC128-56C9-42EC-BDF4-B9FAC0E13FF9@apple.com>

On Sep 5, 2013, at 7:39 AM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:

> On Thu, Sep 5, 2013 at 2:56 AM, Eric Carlson <eric.carlson@apple.com> wrote:
>> 
>> On Sep 3, 2013, at 11:48 PM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
>> 
>>> On Wed, Sep 4, 2013 at 2:14 PM, Eric Carlson <eric.carlson@apple.com> wrote:
>>>> 
>>>> On Sep 3, 2013, at 5:16 PM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
>>>> 
>>>>> What is the content in .text ?
>>>>> 
>>>> The cue text.
>>> 
>>> Just to clarify: Is that a plain text version of the original cue
>>> content? For example, for CEA608 content from a SCC file it would just
>>> be the plain text of the cue stripped of all the other characters? Or
>>> in the case of SRT, all tags are stripped and just the plain text is
>>> exposed?
>>> 
>> 
>>> If you do that, then we're just pretending the in-band text track was
>>> actually WebVTT content.
>>> 
>>  AVFoundation converts the in-band cue data (CEA608, QTText, 3GPP timed text, etc.) to plain text, which sometimes has position and style information. WebKit converts that to WebVTT.
>> 
>>  This is "pretending" the in-band data is WebVTT, but does that matter? I think this is actually an advantage, both because it makes our implementation simpler and because it makes it simpler for developers.
> 
> I'd be happy if everything was exposed as WebVTT. However, that also
> requires that the cue content (and not just the attributes) is
> converted to WebVTT format, unless the track is @kind=metadata.
> 
> Where you have styling information associated with the cue content
> (italics, bold, color, etc), are you also converting the cue text to
> WebVTT and thus exposing that in .text for rendered cues?
> 
  Yes, where the styling information is one of the WebVTT cue components.
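
  As an illustration of what this means for a script: if .text carries WebVTT cue components such as <i> or <b>, a script that only wants the plain text has to remove them itself. The following is a minimal sketch, not the WebVTT cue text parsing algorithm; the function name stripCueTags is hypothetical, and the sketch assumes well-formed cue text and handles only a few of the escapes.

```typescript
// Hypothetical helper: reduce WebVTT-style cue text to plain text.
// Assumption: the input is well-formed cue text, as .text would expose it.
function stripCueTags(cueText: string): string {
  return cueText
    // Drop cue component tags (<i>, </b>, <v Fred>, <c.loud>) and
    // inline timestamps (<00:00:01.000>), which share the <...> shape.
    .replace(/<[^>]*>/g, "")
    // Decode escapes after stripping, so an escaped "<" survives.
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&nbsp;/g, "\u00A0")
    // Decode "&amp;" last so "&amp;lt;" becomes "&lt;", not "<".
    .replace(/&amp;/g, "&");
}

const plain = stripCueTags("<v Fred><i>Hello</i> &amp; <b>goodbye</b>");
// plain === "Hello & goodbye"
```

  In a browser, the cue's getCueAsHTML() method is the spec-defined way to get a parsed representation; the string approach above is only meant to show what kind of markup ends up in .text.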


> 
> 
>>>>> And an orthogonal question: you've probably seen the Cablelabs spec
>>>>> for exposing MPEG-2 in-band text tracks of different types to HTML[1].
>>>>> Seeing as both Cablelabs and the HTML spec are trying to accommodate
>>>>> using in-band text tracks in HTML/JS, what do you suggest is the best
>>>>> way forward to specify this?
>>>>> 
>>>>> * Program Map Table: would you suggest to use VTTCue with
>>>>> @kind=metadata, GenericCue, or a new PMTCue interface?
>>>>> 
>>>> It seems to me that text track content that a UA does not render itself is, by definition, metadata.
>>> 
>>> In this case, that definition works, because no @kind value matches
>>> the semantic content of a PMT track.
>>> But caption data that is not rendered is still semantically of @kind=captions.
>>> 
>>  I disagree. A generic script that sees a track with kind=captions is going to expect the UA to render those "captions" when it makes the track visible.
>> 
>>  It seems clear to me that "caption data" that a UA is not able to render is metadata. This matches the definition of metadata in the spec: "Tracks intended for use from script. Not displayed by the user agent."
> 
> So would the "in-band metadata track dispatch type" attribute tell the
> JS developer what is really in this track?
> 
> For example, TTML in MP4, not supported by the browser, not rendered
> by the browser, but able to be extracted from MP4 by the browser would
> in your suggestion end up as VTTCue objects of @kind=metadata with
> @inBandMetadataTrackDispatchType conveying some information about it
> being captions?
> 
  Yes, exactly.
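
  A rough sketch of how a script might act on that: look for tracks with kind "metadata" and check the dispatch type against the formats it knows how to render itself. The TrackLike interface, the function name, and the dispatch-type strings below are all assumptions made for illustration; in particular "example-ttml-type" is a made-up placeholder, since the actual value depends on the container format and on what the spec defines for it.

```typescript
// Minimal stand-in for the TextTrack properties used here, so the
// sketch runs without a DOM.
interface TrackLike {
  kind: string;
  label: string;
  inBandMetadataTrackDispatchType: string;
}

// Hypothetical helper: pick the metadata tracks whose dispatch type
// is one the page's own script claims to handle.
function findScriptHandledTracks(
  tracks: TrackLike[],
  handledTypes: string[]
): TrackLike[] {
  return tracks.filter(
    (t) =>
      t.kind === "metadata" &&
      handledTypes.indexOf(t.inBandMetadataTrackDispatchType) !== -1
  );
}

const sampleTracks: TrackLike[] = [
  // Rendered by the UA itself, so the script leaves it alone.
  { kind: "captions", label: "English", inBandMetadataTrackDispatchType: "" },
  // Extracted but not rendered by the UA; placeholder dispatch type.
  { kind: "metadata", label: "English (in-band)", inBandMetadataTrackDispatchType: "example-ttml-type" },
  // Metadata of a type this script does not know about.
  { kind: "metadata", label: "Other", inBandMetadataTrackDispatchType: "example-other-type" },
];

const handled = findScriptHandledTracks(sampleTracks, ["example-ttml-type"]);
// handled contains only the "English (in-band)" track
```

  In a page, the input would come from something like Array.from(video.textTracks), and the script would then read the cues of each returned track and draw them itself.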

eric

Received on Thursday, 5 September 2013 16:08:50 UTC