Re: Handling live translation of cues to WebVTT

On Mon, Jan 27, 2014 at 2:29 AM, Brendan Long <B.Long@cablelabs.com> wrote:
> On Sat, 2014-01-25 at 23:47 +0700, Philip Jägenstedt wrote:
>
> It seems to me that when you do live streaming, you're going to be
> using Media Source Extensions, which require rather a lot of
> JavaScript.
>
> The existing <video> tag handles live video fine, except for captions. What
> I'd like to be able to do is play live video with *no* JavaScript. I
> assume live video is (or will be) a common enough case that making it work
> by default is worth it.

Apple supports HLS, sure enough, but that's not supported everywhere.
Is there a way to do live streaming without JavaScript that works in
most or all browsers?

> So, what about the cost in solving this declaratively?
>
> 1. Is the special keyword NEXT for the end time the only new syntax
> that's required?
> 2. When should the end time of a NEXTy cue be updated? Is it when a
> new cue with a higher start time is parsed, or should e.g. a script
> modifying the start time of an existing cue also do something?
> 3. Should the endTime IDL attribute actually be modified, or should it
> simply be that a cue with end time NEXT is not considered active if
> there are any cues with a later end time?
> 4. What happens when you have two cues with the same start time that
> both have end time NEXT?
>
> I actually don't think "NEXT" is the best solution to our problem. Some
> internal discussion showed that it was more of a hack, when what we really
> want is the ability to rewrite cue end times later. We also want the ability
> to fix typos, so I proposed a syntax where any cue with the same ID as
> previous cues would completely replace them:
>
> http://lists.w3.org/Archives/Public/public-tt/2014Jan/0005.html
>
> I think it's a good solution, because it gives us typo correction and
> timestamp correction, with a syntax that matches existing WebVTT. I'm open
> to other solutions that let us change the end time though.

It seems like this is solving way more than just updating the end
time, but let's see.

Should the old cue be removed and a new one inserted, or should the
old cue be updated in place?

What should happen if the previous cue has already been modified by scripts?

What happens if there are multiple existing cues with the same id?
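
To make the first two questions concrete, here is a rough sketch in
Python of the two behaviours. It is only an illustration: the Cue class
and both function names are hypothetical, not taken from the WebVTT
spec or from any implementation, and removing *every* earlier cue with
a matching id is just one possible answer to the third question.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    # Hypothetical stand-in for a parsed cue; not the TextTrackCue interface.
    id: str
    start: float
    end: float
    text: str


def replace_cue(cues, new):
    """Option A: remove every existing cue with the same id, insert the new one.

    Returns a new list; the old cue objects are left untouched.
    """
    kept = [c for c in cues if c.id != new.id]
    kept.append(new)
    return kept


def update_in_place(cues, new):
    """Option B: overwrite the first cue with the same id, keeping its identity.

    Falls back to appending when no cue with that id exists.
    """
    for c in cues:
        if c.id == new.id:
            c.start, c.end, c.text = new.start, new.end, new.text
            return cues
    cues.append(new)
    return cues
```

The visible difference is for scripts holding a reference to the old
cue: with option A that reference goes stale, while with option B any
changes the script made to the cue are silently overwritten.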

Whatever the answers to these questions, it seems that for each parsed
cue, one must check whether there is already a cue with the same id.
This will either make the parser O(n^2) or require a hash table, i.e.
more memory, and this cost will be paid by all users of WebVTT, not
just those doing live streaming with edits.
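
As a sketch of that trade-off (Python, illustrative only; the tuple
layout, find_by_scan and CueIndex are made-up names, and treating cues
without an id as never matching is my assumption):

```python
def find_by_scan(cues, cue_id):
    """Linear search: O(n) per parsed cue, so O(n^2) for a file of n cues."""
    for i, cue in enumerate(cues):
        if cue[0] == cue_id:
            return i
    return None


class CueIndex:
    """Cue list plus a hash table from id to list position.

    Lookups are O(1) on average, at the cost of one extra table entry
    per distinct id, whether or not the file ever reuses an id.
    """

    def __init__(self):
        self.cues = []    # (id, start, end, text) tuples in parse order
        self.by_id = {}   # id -> index into self.cues

    def add(self, cue_id, start, end, text):
        cue = (cue_id, start, end, text)
        # Assumption: an empty id never matches an earlier cue.
        pos = self.by_id.get(cue_id) if cue_id else None
        if pos is None:
            if cue_id:
                self.by_id[cue_id] = len(self.cues)
            self.cues.append(cue)
        else:
            self.cues[pos] = cue  # later cue replaces the earlier one


index = CueIndex()
index.add("1", 0.0, 5.0, "Helo wrld")
index.add("2", 5.0, 9.0, "Second cue")
index.add("1", 0.0, 4.0, "Hello world")  # same id: replaces the first cue
```

Either way, the bookkeeping happens for every file that uses cue ids,
including files that never repeat one.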

Philip

Received on Monday, 27 January 2014 03:01:36 UTC