Re: [blink-dev] WebVTT vs TTML Features from Silvia Pfeiffer on 2013-12-10 (public-texttracks@w3.org from December 2013)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Wed, 11 Dec 2013 09:59:33 +1100
To: Glenn Adams <glenn@chromium.org>
Cc: "public-texttracks@w3.org" <public-texttracks@w3.org>, John Luther <jluther@google.com>, Victor Carbune <vcarbune@chromium.org>, David Singer <singer@apple.com>, Nigel Megitt <nigel.megitt@bbc.co.uk>, Silvia Pfeiffer <silviapf@chromium.org>
Message-ID: <CAHp8n2kXu8iGLTs_qJXvcSi37C1UuRsajQewiMpfS150skLarw@mail.gmail.com>

Some corrections inline since there seem to be some misunderstandings.


On Wed, Dec 11, 2013 at 8:20 AM, Glenn Adams <glenn@chromium.org> wrote:
>
> On Wed, Dec 11, 2013 at 5:08 AM, Silvia Pfeiffer <silviapfeiffer1@gmail.com>
> wrote:
>>
>>
>> On 11 Dec 2013 07:56, "Glenn Adams" <glenn@chromium.org> wrote:
>> >
>> >
>> > On Wed, Dec 11, 2013 at 3:34 AM, David Singer <singer@apple.com> wrote:
>> >>
>> >>
>> >> On Dec 9, 2013, at 11:36 , Glenn Adams <glenn@chromium.org> wrote:
>> >>
>> >> > But not as well as you could it would seem: on Chrome, WebVTT is
>> >> > simply translated to cues referring to a CSS styled HTML fragment. Why not
>> >> > simply define an HTMLCue, and dispense entirely with VTTCue and the WebVTT
>> >> > parser? The WebVTT could be translated to a sequence of HTML cues on the
>> >> > server or using client JS.
>> >> >
>> >>
>> >> This is probably stating the obvious, but you asked.
>> >>
>> >> for at least two reasons:
>> >>
>> >> * we want this to be only one of many possible implementation choices
>> >> and
>> >> * we want there to be a simple expression of the timed cues that is not
>> >> dependent on an implementation choice
>> >
>> >
>> > Which would require the "simple expression" to be a semantic/stylistic
>> > superset of formats, which HTML/CSS is, but WebVTT isn't.
>>
>> Allowing all of html and css in cues is madness.
>
>
> I don't recall ever saying to allow "all" of html/css. The fact of the
> matter is that VTT implementations translate VTT cues to some subset of
> HTML/CSS. We are also defining a mapping from TTML to some subset of
> HTML/CSS.
>
> This process begs the question of whether any translation from an input
> format like TTML or VTT into HTML/CSS should be implemented in the browser
> rather than in, say, JS client code.

Yes, that's what we're doing with VTT when VTT is used in the browser
- we map it to HTML and CSS for rendering. Non-browsers can decide to
use a different approach for rendering.


> Going one step further, it is natural
> to ask if it makes sense to have servers deliver cues using HTML/CSS
> directly, thus even avoiding the need for JS client translation.

That makes browsers have to support all of HTML/CSS in cues, which, as
I said above, makes no sense.


>> Why did ttml not do that either?
>
> The current cue system defined in HTML5 is a new concept and mechanism. That
> it is defined in terms of getCueAsHTML() for rendering purposes begs the
> question of whether to use HTML in the first place.
>
> It has recently been suggested (very strongly indeed) that clients need not
> directly support TTML rendering since JS client code could perform
> translation into HTML/CSS fragments.  That is not an unreasonable
> suggestion, but it is inconsistent with saying that a client should directly
> support VTT to HTML/CSS translation, while saying a client shouldn't do this
> for TTML.

This is not the right place to discuss decisions that browsers made
about which formats they want to implement.


> My purpose in suggesting the potential utility of defining an HTMLCue as
> such is to demonstrate that one *could* dispense with any direct client
> support for VTT or TTML other than fetching or demultiplexing VTT/TTML
> content and passing it to client JS code to be translated into HTMLCue
> instances.

DataCue does that already. You can always expose normal HTML in a
DataCue and then simply render .text in a DocumentFragment.


>> Authors of captions need something that works for the use case, i.e.
>> captioning, and not for publishing. If you want all of html+CSS, you don't
>> need a new format - you just write a web page.
>
>
> I never said "all of HTML/CSS". Note that at present, VTTCue.getCueAsHTML()
> doesn't explicitly limit what HTML is contained in the returned fragment.

VTT does limit what HTML is contained in getCueAsHTML(). For example,
there will never be a <table> element in a DocumentFragment returned
by VTTCue.getCueAsHTML().
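
As a rough illustration of that limit (a sketch only, not code from the
WebVTT spec or from any browser): cue text only defines the tags c, i, b,
u, ruby, rt, v and lang, plus timestamp tags, so anything else never
reaches the fragment. The helper name unsupportedTags and the simple
regex scan are assumptions made for this example; a real parser follows
the spec's tokenizer rather than a regular expression.

```typescript
// Tag names that WebVTT cue text defines: class, italics, bold, underline,
// ruby, ruby text, voice and language spans.
const VTT_CUE_TAGS: string[] = ["c", "i", "b", "u", "ruby", "rt", "v", "lang"];

// Hypothetical helper: returns the distinct tag names found in `cueText`
// that are not WebVTT cue tags. Timestamp tags such as <00:01.000> start
// with a digit, so the pattern below skips them.
function unsupportedTags(cueText: string): string[] {
  const found: string[] = [];
  // Matches the name of an opening or closing tag; stops before any
  // class suffix (<c.loud>) or annotation (<v Fred>).
  const tagPattern = /<\/?([A-Za-z][A-Za-z0-9]*)/g;
  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(cueText)) !== null) {
    const name = match[1];
    if (VTT_CUE_TAGS.indexOf(name) === -1 && found.indexOf(name) === -1) {
      found.push(name);
    }
  }
  return found;
}

// Ordinary cue markup passes; table markup is reported.
console.log(unsupportedTags("<v Fred><i>Hello</i> <c.loud>world</c></v>")); // []
console.log(unsupportedTags("<table><tr><td>x</td></tr></table>")); // ["table", "tr", "td"]
```

A check like this only shows which markup falls outside what a VTT cue
can carry; it says nothing about how a given browser renders the cue.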


> If we wanted, we could finish the process of formally defining the VTT to
> HTML/CSS mapping, do the same for TTML, then constrain the fragments
> returned from getCueAsHTML() to the subset of HTML/CSS that is sufficient to
> render these formats.

The point of VTT is that it also allows VTT cues to be authored in
JavaScript and added to the list of cues. The content of such cues has
to be limited to what VTT cues support. If you want all of HTML/CSS,
you have to use DataCue.

Regards,
Silvia.

Received on Tuesday, 10 December 2013 23:00:21 UTC