Re: A new proposal for how to deal with text track cues

Hi all,

FYI: I have dropped my proposal from this thread, since nobody seemed
to agree with its basic idea of creating cue objects based on
semantics rather than file format.

So, the only changes that I've made to the WebVTT specification are:
* renamed WebVTTCue to VTTCue (for brevity)
* extracted the description of the "serialisation"
(http://dev.w3.org/html5/webvtt/#webvtt-file-structure) from the cue
payload (http://dev.w3.org/html5/webvtt/#types-of-webvtt-cue-payload)
in the syntax

Still open:
* I still have to address rendering differences between chapters and
captions/subtitles (http://dev.w3.org/html5/webvtt/#rendering)
* I also still have to address what markup to add for descriptions
tracks (https://www.w3.org/Bugs/Public/show_bug.cgi?id=10944)
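
For anyone skimming the quoted discussion below, here is a rough sketch of
the two-stage model Glenn describes: the file-level parser produces cue
objects that hold the raw payload, and the cue text parser only runs when
something asks for the parsed form. All names here (SketchCue, parseFile,
parseCueText, getCueAsNodes) are made up for illustration and are not the
spec's or any browser's API. The sketch only understands <b>, <i> and <u>,
ignores cue identifiers and cue settings, and returns a plain node tree
rather than a DocumentFragment.

```typescript
// Hypothetical node tree standing in for the DocumentFragment that
// getCueAsHTML() would return in a browser.
type CueNode =
  | { kind: "text"; value: string }
  | { kind: "element"; tag: string; children: CueNode[] };

// Stage 2 (assumed, heavily simplified): parse cue text into nodes.
// Only <b>, <i>, <u> are recognised; anything else is kept as text.
function parseCueText(input: string): CueNode[] {
  const root: CueNode[] = [];
  const stack: CueNode[][] = [root];
  const tokens = input.split(/(<\/?[biu]>)/).filter((t) => t.length > 0);
  for (const token of tokens) {
    const current = stack[stack.length - 1] ?? root;
    if (/^<[biu]>$/.test(token)) {
      const children: CueNode[] = [];
      current.push({ kind: "element", tag: token.slice(1, 2), children });
      stack.push(children);
    } else if (/^<\/[biu]>$/.test(token)) {
      if (stack.length > 1) stack.pop(); // stray end tags are ignored
    } else {
      current.push({ kind: "text", value: token });
    }
  }
  return root;
}

// Hypothetical cue object: holds the output of the file-level parser only.
class SketchCue {
  readonly startTime: number;
  readonly endTime: number;
  readonly text: string; // raw payload, stored unparsed
  parseCount = 0; // how often the cue text parser has been run

  constructor(startTime: number, endTime: number, text: string) {
    this.startTime = startTime;
    this.endTime = endTime;
    this.text = text;
  }

  // Stand-in for getCueAsHTML(): the cue text parser runs on demand.
  getCueAsNodes(): CueNode[] {
    this.parseCount += 1;
    return parseCueText(this.text);
  }
}

// Convert "hh:mm:ss.ttt" or "mm:ss.ttt" into seconds.
function parseTimestamp(ts: string): number {
  return ts.split(":").map(Number).reduce((total, part) => total * 60 + part, 0);
}

// Stage 1 (assumed, heavily simplified): split the file into cue blocks
// and read the timings. The payload is not inspected here.
function parseFile(source: string): SketchCue[] {
  const blocks = source.replace(/\r\n?/g, "\n").split(/\n{2,}/);
  const cues: SketchCue[] = [];
  for (const block of blocks) {
    const lines = block.split("\n");
    const timing = lines.findIndex((line) => line.includes("-->"));
    if (timing === -1) continue; // header or other non-cue block
    const [start = "", rest = ""] = (lines[timing] ?? "").split("-->");
    const end = rest.trim().split(/\s+/)[0] ?? ""; // drop cue settings
    const text = lines.slice(timing + 1).join("\n").trim();
    cues.push(new SketchCue(parseTimestamp(start.trim()), parseTimestamp(end), text));
  }
  return cues;
}
```

Note that parseFile never looks at @kind and never touches the payload,
which is the point Glenn makes: the cue object is the output of the file
parser, and the cue text parsing happens later and separately.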

Cheers,
Silvia.


On Tue, Jun 25, 2013 at 12:40 PM, Silvia Pfeiffer
<silviapfeiffer1@gmail.com> wrote:
> On Tue, Jun 25, 2013 at 11:28 AM, Glenn Maynard <glenn@zewt.org> wrote:
>> I think I see the basic issue.  WebVTTCue isn't the output of the cue
>> parser, it's the output of the WebVTT parser.  None of the fields of
>> WebVTTCue are inherently specific to kind=subtitles (or caption); they're
>> simply the output of the file-level WebVTT parser.  Not all of them are used
>> by every renderer, but they're all valid, and they're all parsed in the same
>> way regardless of @kind.
>>
>> The cue parser ("WebVTT cue text parsing rules") is run only if you call
>> getCueAsHTML (or if the cue is actually rendered, of course), and the output
>> of that is a DocumentFragment.
>
> Correct. That all fields are applied the same to all content is what I
> am trying to point out as being the problem. Every feature that we add
> because we need it, e.g. for one particular use case, is applied to all
> cue kinds. But I think I can live with that situation a little longer.
>
>>  If you want to add, for example, a TTML
>> parser for kind=chapters, then the only addition to WebVTTCue would be a
>> getCueAsTTML() method, which would invoke the TTML parser and return an
>> interface specific to that format, just like getCueAsHTML() does for
>> captions.
>
> That's not quite how I understand it:
> getCueAsHTML() takes a cue and applies the "WebVTT cue text parsing
> rules" and the "WebVTT cue text DOM construction rules" to get back
> HTML.
> getCueAsTTML() would take a cue and apply the "WebVTT cue text parsing
> rules" and new "TTML cue text construction rules", thus returning TTML
> for WebVTT.
>
> A TTML parser would require a TTMLCue object with its own
> getCueAsHTML() function to return an HTML fragment.
>
>
>> On Fri, Jun 14, 2013 at 3:57 AM, Silvia Pfeiffer <silviapfeiffer1@gmail.com>
>> wrote:
>>>
>>> I was told that TTML indeed supports chapters, though I haven't seen
>>> any TTML files in use for that purpose. They would also just be timed
>>> cues with plain text, I was told.
>>
>>
>> By the way, while I don't know a lot about TTML, using it for chapters
>> smells like a really bad idea.  The spec is several times the size of
>> WebVTT, it's based on XML--that always makes me nervous--and generally looks
>> heavy and overdesigned.  Hopefully WebVTT doesn't need to depend on a big
>> and unproven spec, if they're really just timed cues with plain text.  (I
>> can't find chapters in the spec--the word "chapter" appears only once, in an
>> example.)
>
> Agreed, I also don't want to add a TTML parser into a WebVTT object.
>
>
>> On Thu, Jun 13, 2013 at 11:34 PM, Silvia Pfeiffer
>> <silviapfeiffer1@gmail.com> wrote:
>>>
>>> >> Once a WebVTT file is parsed into a list of cues, the browser should
>>> >> not have to care any more that the list of cues came from a WebVTT
>>> >> file or anywhere else. It's a list of cues with a certain type of
>>> >> content that has a parsing and a rendering algorithm attached.
>>> >
>>> > If it has a rendering algorithm attached, then the browser does care
>>> > that it
>>> > came from a WebVTT file, since that's what that flag indicates.
>>>
>>> The object is called WebVTTCue - thus the browser cares that it came
>>> from a WebVTT encapsulation.
>>
>>
>> Changing that to "TextTrackData" with an attached rendering algorithm of
>> "WebVTT" is only a more complicated way of saying the same thing--it still
>> cares that it's WebVTT.
>
> I built my proposal on the expectation that in future we will want to
> have "WebVTT subtitle cue text parsing rules" and "WebVTT chapter cue
> text parsing rules" and "WebVTT description cue text parsing rules"
> because they have sufficiently different cue settings and cue node
> objects that we'd want to separate the parsing rules. But I suppose we
> can continue creating more cue settings and markup for all cue kinds
> for a while before we create something that creates a problem. So,
> let's cross that bridge when we get to it.
>
>
>>> >  I'm also
>>> > confused that you're saying you want to split the interfaces for
>>> > metadata
>>> > and cues, but that if the platform supports a DVD bitmap caption
>>> > interface
>>> > in the future, you'd want those to use the same interface, even though
>>> > their
>>> > data and attributes would be completely different.
>>>
>>> Right now, we have one catch-all type cue: cues of kind metadata.
>>> Other cues have explicit parsing and rendering algorithms associated.
>>>
>>> Think about it like this: we understand captions/subtitle cues, have
>>> defined an explicit parsing and rendering algorithm. We are about to
>>> do the same for chapter cues - and after that we will do the same for
>>> description cues. These are similar to the <img>, <video>, <audio>
>>> tags that we have defined to pull in embedded content. That leaves
>>> metadata cues for which we don't define a parsing or rendering
>>> algorithm - just like <embed>/<object> remains as a generic means to
>>> pull other types of embedded content into the browser.
>>
>>
>> (This response isn't related to what I said, so I'll restate.)
>
> OK.
>
>> If you think WebVTT subtitles should have a separate interface from WebVTT
>> metadata, then it's confusing that you would want to use the same interface
>> for DVD subtitles and WebVTT subtitles, since the latter two are much more
>> different than the former two.
>
> That's not what I suggested - I only suggested using the same objects
> for data that has the same content. When we get DVD subtitles, we need
> a new DVDSubtitleCue interface.
>
>
>>  You said that you want browsers to have a
>> list of source-independent cues with a rendering algorithm attached,
>
> Right - for those data types that have the same content.
>
>> which
>> means you'd have a SubtitleCue with all of the attributes from both WebVTT
>> and DVD, and an "algorithm=dvd" flag.
>
> That does not follow from my statements. I am not limiting the number
> of XXCue objects to the list of kinds that we have - that was not the
> implication.
>
> I'll have a bit more of a think about this. I'm currently thinking of
> only renaming WebVTTCue to VTTCue (for simplicity).
>
> Cheers,
> Silvia.

Received on Tuesday, 23 July 2013 03:13:04 UTC