RE: TextTrackCue discussions

Yes, the API covers both a full TTML DOM and a mapping to the cue list from HTML. TTML is indeed similar to SVG or MathML, and might conceivably one day be embedded in HTML in the same way. Even absent that, authoring and transcribing tools will be more interested in the former, while play-out engines may be more interested in the latter. I also put in a bridge between them, so that, for example, it is possible to create a TTML file from a list of live cues. TTML markup is of course a lot richer than just time-aligned cue sequences, although such sequences can be generated from a TTML file.
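To make the "bridge" direction concrete, here is a minimal sketch of serializing a list of cues to a bare TTML document. It is not the interface from the change proposal: the SimpleCue shape and the cuesToTtml and escapeXml names are assumptions made up for illustration, and the output uses only the TTML namespace, xml:lang, and begin/end offset-time values on p elements.

```typescript
// Hypothetical minimal cue shape (an assumption, not the proposal's interface).
interface SimpleCue {
  startTime: number; // seconds
  endTime: number;   // seconds
  text: string;      // plain text, no markup
}

// Escape characters that are special in XML text and attribute values.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a bare TTML document: one <p> per cue, timed with offset-time values.
function cuesToTtml(cues: SimpleCue[], lang: string = "en"): string {
  const paragraphs = cues.map(
    (c) =>
      `      <p begin="${c.startTime.toFixed(3)}s" end="${c.endTime.toFixed(3)}s">` +
      `${escapeXml(c.text)}</p>`
  );
  const lines = [
    `<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="${escapeXml(lang)}">`,
    `  <body>`,
    `    <div>`,
  ]
    .concat(paragraphs)
    .concat([`    </div>`, `  </body>`, `</tt>`]);
  return lines.join("\n");
}
```

Going the other way (TTML file to cue list) loses information, since styling, layout and nesting have no place in a flat cue sequence.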

I don't really see the TTML DOM part having an analogue in WebVTT, so I'm not especially concerned with consistency there. I'd like it to look and feel more like the HTML/SVG/MathML types of DOM, so I modelled the main TTML DOM after http://www.w3.org/TR/DOM-Level-2-HTML/ecma-script-binding.html which has types for all the elements. I haven't put all the attributes into the TTML version yet, which is why it perhaps looks a little odd as is.

For the cue-centric part, I guess it's a case of whether you prefer typed or untyped hierarchies. If the style du jour is untyped, then I'm fine with an enum of element names. On reflection, it's mostly just the difference between using the instanceof operator and matching an attribute against an enum. While I definitely foresee people building TTML files ab initio in browser-based applications, I guess they are less likely to build lists of cue objects (depending on how we solve the problem of live captions). I could definitely see reducing the cue representation types down to a single type with an enum, but I did it the other way for consistency.
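As a rough illustration of the two styles, the sketch below counts span nodes in a small tree, once with a class per element kind and once with a single node type tagged by element name. All names here (TypedNode, TypedSpan, TypedBr, ElementName, UntypedNode) are made up for the example and are not taken from the proposal; a string-literal union stands in for a WebIDL-style enum.

```typescript
// Typed style: one class per element kind (illustrative names only).
class TypedNode {
  children: TypedNode[];
  constructor(children: TypedNode[] = []) {
    this.children = children;
  }
}
class TypedSpan extends TypedNode {}
class TypedBr extends TypedNode {}

// The element kind is discovered with a runtime type check.
function countSpansTyped(node: TypedNode): number {
  let count = node instanceof TypedSpan ? 1 : 0;
  for (const child of node.children) count += countSpansTyped(child);
  return count;
}

// Untyped style: a single node type carrying an enum-like element name.
type ElementName = "p" | "span" | "br";
interface UntypedNode {
  name: ElementName;
  children: UntypedNode[];
}

// The element kind is discovered by comparing an attribute.
function countSpansUntyped(node: UntypedNode): number {
  let count = node.name === "span" ? 1 : 0;
  for (const child of node.children) count += countSpansUntyped(child);
  return count;
}
```

The traversal code is nearly identical in both cases; the difference shows up mainly in how many interfaces the specification has to define.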




-----Original Message-----
From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com] 
Sent: 23 July 2013 05:35
To: Glenn Adams
Cc: TTWG
Subject: Re: TextTrackCue discussions

This is feedback only to the TTWG, because cross-posting is discouraged.

On Tue, Jul 23, 2013 at 2:00 PM, Glenn Adams <glenn@skynav.com> wrote:
>
>
>
> On Mon, Jul 22, 2013 at 9:16 PM, Silvia Pfeiffer 
> <silviapfeiffer1@gmail.com>
> wrote:
>>
>> That's a fair observation and right now each file format (in 
>> particular WebVTT) provides for all the semantics through the same 
>> internal markup. I suppose we can continue creating more WebVTT cue 
>> settings and markup for all cue kinds for a while before we create 
>> something that creates a problem. Also, there is not currently a 
>> specification of a different cue JS object (such as TTMLCue). So, 
>> let's cross that bridge when we get to it.
>
>
> We are already at that bridge. There is an early draft specification
> of a TTMLCue in [1], with editing actions assigned to begin bringing
> this into TTML2 ED. There is also development work underway to
> implement this functionality in multiple browsers.
>
> [1] http://www.w3.org/wiki/TTML/changeProposal006

That proposal has a large number of new objects. Why would a JavaScript developer need that many objects? We only need to introduce a new object when we expect JS devs to create such objects. I don't really see that happening for most of the objects listed in this proposal. BTW: it would be nice if TTMLTextTrackCue were renamed to TTMLCue.


> That being said, if we had sufficient time on our hands, we could 
> attempt to merge the semantics of the two different cue formats such 
> that specifying a single, general purpose interface would suffice.

I actually thought it would be simple: both are just descriptions of time-aligned cue sequences. So we'd need a description of a cue, of a cue sequence, and of the text track. However, it seems that the focus of the TTML API is to provide a full DOM representation for TTML (not unlike HTML) rather than just the necessary data as a text track format (which is what WebVTT does). That tells me that the focus of this API is a different one from the TextTrack API and WebVTT: it's like a completely separate spec with its own full DOM, similar to SVG or MathML or another markup language.

> However, the existing
> TextTrackCue/VTTCue interface was proposed and implemented without 
> taking into account the semantics of a TTML based cue. It may turn out 
> that there is wide overlap, but there may be non-overlapping semantics 
> as well, and, absent a thorough comparative analysis, we can't yet 
> derive reliable conclusions.

The problem is that you're trying to represent every single piece of information in TTML as a separate JS object. That's not what WebVTT does: we don't have WebVTTUnderlineElement or WebVTTVoiceElement objects in WebVTT. They are just markup in WebVTT that gets converted to an HTML fragment with getCueAsHTML(). There is no need for explicit objects since none of those objects actually end up in the HTML page's DOM.

If all you retain is TTMLTextTrackCue and add some of the, then we're back to being conformant with the TextTrack API.


> Our options seem to be:
>
> (1) proceed with defining separate VTTCue and TTMLCue interfaces, 
> possibly moving common functionality to their common TextTrackCue 
> interface over time (future versions);

Yes, I think they need to be different because they contain different markup. There's no means to merge them to a single interface.
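Option (1) can be pictured roughly as follows. This is only a sketch under stated assumptions: BaseCue, VttLikeCue, TtmlLikeCue and activeAt are hypothetical names, the base carries just the id/startTime/endTime attributes that the HTML TextTrackCue interface defines, and the format-specific fields are simplified stand-ins for the real payloads.

```typescript
// Shared timing surface, loosely modelled on the HTML TextTrackCue attributes.
interface BaseCue {
  id: string;
  startTime: number; // seconds
  endTime: number;   // seconds
}

// Format-specific payloads; both shapes are hypothetical simplifications.
interface VttLikeCue extends BaseCue {
  kind: "vtt";
  text: string; // raw WebVTT cue text, markup included
}

interface TtmlLikeCue extends BaseCue {
  kind: "ttml";
  fragment: string; // serialized TTML content for this interval
}

type AnyCue = VttLikeCue | TtmlLikeCue;

// Code that only needs timing can stay format-agnostic.
function activeAt<T extends BaseCue>(cues: T[], time: number): T[] {
  return cues.filter((c) => c.startTime <= time && time < c.endTime);
}
```

Anything that depends on the markup itself still has to branch on the concrete cue type, which is why the two interfaces stay separate.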


> (2) create a new common interface design after a thorough comparative 
> analysis of the two formats' cue semantics;

I don't think that's possible.

The proposed interface in http://www.w3.org/wiki/TTML/changeProposal006 is a mix between two things:
(1) it hooks into the HTML spec's TextTrackCue API through TTMLTextTrackCueAPI (which I think would be nice to be called TTMLCue)
(2) it provides a full DOM representation of a TTML file (btw: TTMLTexttrack should rather be called TTMLFile - that would be more appropriate)

(1) is what the browsers need. I'm not sure who needs (2). Possibly transcoding applications.


> I would suggest that the first option is more practical and will yield 
> better short term results. However, since the draft re-charter for the 
> TTWG includes language to develop a common basis for semantics, then 
> the second option may be pursued in that context.

I'm still unclear about what a "common semantic" means. Sean seems to have an idea, but I continue to be baffled by the concept and what the consequences are.

Thanks for sharing that document! It was most interesting to see this.

Regards,
Silvia.

Received on Tuesday, 23 July 2013 14:15:05 UTC