Re: Metadata in the VTT file header (bug 15851), use cases (and a need to close this)

On Thu, Sep 13, 2012 at 5:23 AM, David Singer <singer@apple.com> wrote:
>
> On Sep 12, 2012, at 9:18 , Philip Jägenstedt <philipj@opera.com> wrote:
>
>> On Thu, 30 Aug 2012 02:13:03 +0200, David Singer <singer@apple.com> wrote:
>>
>>> On Aug 29, 2012, at 16:53 , Ian Hickson <ian@hixie.ch> wrote:
>>>
>>>> On Wed, 29 Aug 2012, David Singer wrote:
>>>>>
>>>>> 5) Time alignment. When WebVTT is used as the caption source for a
>>>>> system where timestamps are from an arbitrary origin (e.g. a continuous
>>>>> MPEG-2 Transport stream) we need a way to say that 'timestamp X in this
>>>>> VTT file aligns with Timestamp Y in the media stream' so as to get
>>>>> synchronization.  This is naturally put into the header.
>>>>
>>>> If there's a WebVTT file with fixed timestamps and a media stream with
>>>> arbitrary timestamps, then the only place where it makes sense to put the
>>>> synchronisation information is in the media stream. Putting it in the
>>>> WebVTT stream makes no sense; if you are able to adjust that stream then
>>>> why not just adjust the timestamps?
>>>
>>> Pardon?  You're suggesting completely re-writing the timestamps in the mpeg-2 transport stream so as to … do exactly what?  What we need is a mapping, not a need to re-write whole streams.
>>
>> MPEG-2 TS doesn't seem particularly relevant to the Web. In any case, adjusting all of the WebVTT timestamps using JavaScript would be trivial, as would rewriting the entire WebVTT file on the server.
>
> MPEG-2 TS timestamps cycle, and I don't think WebVTT ones are allowed to.  Re-writing entire files, finding every timestamp in them, is a lot less 'trivial' than a mapping.
>
> MPEG-2 TS continues to be a popular format, despite its age.  It is the media format behind our live streaming system, for example.

For the record: I don't think MPEG-2 TS is a use case for us,
precisely because it doesn't fit the model of a time-linear media
resource. I would expect it to be used only through adaptive streaming,
where either the JS maps the data to the timeline using the MediaSource
API, or a DASH-type file makes sure the timing is provided
consistently.

Do you have a concrete example where that would not be sufficient?
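
To make the comparison concrete, here is a rough sketch of the kind of
mapping David describes: one anchor pair ("timestamp X in the VTT file
aligns with timestamp Y in the media stream") applied to each cue time,
with no rewriting of the file. It assumes the usual MPEG-2 presentation
timestamps (33-bit counter on a 90 kHz clock, so values wrap around). The
function names and the way the anchor pair is passed in are made up for
illustration; no header syntax for this has been agreed.

```python
PTS_CLOCK_HZ = 90_000      # MPEG-2 PTS ticks per second
PTS_MODULUS = 1 << 33      # PTS is a 33-bit counter, so it cycles


def vtt_to_seconds(timestamp):
    """Convert a WebVTT timestamp ([hh:]mm:ss.ttt) to seconds."""
    parts = timestamp.split(":")
    seconds = float(parts[-1])
    minutes = int(parts[-2])
    hours = int(parts[-3]) if len(parts) == 3 else 0
    return hours * 3600 + minutes * 60 + seconds


def vtt_to_pts(cue_time, vtt_anchor, pts_anchor):
    """Map a cue time onto the stream's PTS timeline.

    vtt_anchor and pts_anchor are the hypothetical anchor pair:
    vtt_anchor (a WebVTT timestamp) aligns with pts_anchor (PTS ticks).
    """
    delta_seconds = vtt_to_seconds(cue_time) - vtt_to_seconds(vtt_anchor)
    delta_ticks = round(delta_seconds * PTS_CLOCK_HZ)
    # The modulo reproduces the wrap-around of the stream's counter.
    return (pts_anchor + delta_ticks) % PTS_MODULUS
```

A consumer would call vtt_to_pts() once per cue boundary; the WebVTT
timestamps themselves stay linear and untouched.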


>> For the record, I'm also not very enthusiastic about adding key-value metadata to WebVTT. Duplicating language and kind seem like a nice way to confuse, and browsers would just ignore the metadata anyway. The suggested syntax for multi-line values also looks pretty exotic to me.
>
>
> You're assuming a browser embedding; we envisage VTT used in a whole bunch of scenarios and locations.

Not surprisingly: I agree. :-)

> Given the need for a value-terminator and a header-block-terminator (blank line), escaping lines that look like those (plus escaping lines that look like the escape) seems pretty simple, obvious, and minimal.  The choice of the 'bracketing' characters is taste;  I suggested [[ and ]], others preferred "|" and ".".  As long as the terminator is rare in the obvious use-case (CSS), I don't think it really matters.

Again: I agree. Let's pick one way and then move on.

I've just tested what Simon suggested earlier, in Google Chrome, Opera
Next and Safari, and they can all deal with it:

WEBVTT
language: fr
kind: subtitles

STYLE
#foo { color:green }
i { font-family:serif }

#bar { color:red }

foo
00:00:00.000 --> 00:00:05.000
testing <i>testing</i>

It's pretty and browsers can deal with it. The rest of the tools will
then just have to treat everything before the first valid cue (one
that has a --> in it) as the header (even if that's different to how
the WebVTT spec is written). If we make specs for such other tools
outside the core WebVTT spec here, that should not be a problem, I
guess.
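
For such a tool, the rule could be sketched roughly like this. It is only
an illustration, not how the WebVTT spec defines parsing: split_header()
is a made-up name, and I'm assuming that a cue identifier line directly
above the timing line belongs to the cue rather than to the header.

```python
SAMPLE = """WEBVTT
language: fr
kind: subtitles

STYLE
#foo { color:green }
i { font-family:serif }

#bar { color:red }

foo
00:00:00.000 --> 00:00:05.000
testing <i>testing</i>
"""


def split_header(text):
    """Split WebVTT text into (header_lines, body_lines).

    Everything before the block holding the first line with "-->" is
    treated as header. Assumption: that block starts right after the
    last blank line above the "-->" line, so a cue identifier stays
    with its cue.
    """
    lines = text.splitlines()
    first_timing = next(
        (i for i, line in enumerate(lines) if "-->" in line), None
    )
    if first_timing is None:
        # No cue at all: the whole file is header.
        return lines, []
    start = first_timing
    # Walk back to the first line of the block containing the timing line.
    while start > 0 and lines[start - 1].strip() != "":
        start -= 1
    return lines[:start], lines[start:]


header, body = split_header(SAMPLE)
```

With the file above, both STYLE blocks (including the one after the blank
line) land in the header, and the body starts at the "foo" identifier.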

Cheers,
Silvia.

Received on Thursday, 13 September 2012 02:44:01 UTC