Re: Metadata in the VTT file header (bug 15851), use cases (and a need to close this)

On Aug 30, 2012, at 10:22 , Ian Hickson <ian@hixie.ch> wrote:

>>> As in:
>>> 
>>>  WEBVTT
>>> 
>>>  00:11.000 --> 00:13.000
>>>  <v Roger Bingham>We are in New York City
>>> 
>>>  OFFSET -01:00.000
>>> 
>>>  01:13.000 --> 01:16.000
>>>  <v Roger Bingham>We're actually at the Lucern Hotel, just down the 
>>> 
>>> ...or some such.
>> 
>> I can't see how you could do that in a backwards-compatible fashion.  
> 
> I don't understand. Backwards-compatible with what?

Existing parsers will try to parse that OFFSET line as a cue, and fail/complain.  That's a compatibility failure.
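
To make the failure concrete, here is a minimal sketch of a strict, hypothetical parser (the function name and regex are mine, and this is not the parsing algorithm from the WebVTT spec). It assumes every block after the signature must begin with a timing line, so the OFFSET block is rejected:

```python
import re

# Assumption: a cue block starts with "start --> end" timestamps (hours optional).
TIMING = re.compile(r"^(?:\d+:)?\d{2}:\d{2}\.\d{3} --> (?:\d+:)?\d{2}:\d{2}\.\d{3}")

SAMPLE = """WEBVTT

00:11.000 --> 00:13.000
<v Roger Bingham>We are in New York City

OFFSET -01:00.000

01:13.000 --> 01:16.000
<v Roger Bingham>We're actually at the Lucern Hotel
"""


def parse_cues_naively(text):
    """Hypothetical strict parser: any non-cue block is an error."""
    blocks = [b for b in text.strip().split("\n\n") if b.strip()]
    if not blocks or not blocks[0].startswith("WEBVTT"):
        raise ValueError("missing WEBVTT signature")
    cues = []
    for block in blocks[1:]:
        lines = block.splitlines()
        timing = lines[0].strip()
        if not TIMING.match(timing):
            # The OFFSET line lands here: it is read as a malformed cue.
            raise ValueError(f"not a cue: {timing!r}")
        cues.append((timing, "\n".join(lines[1:])))
    return cues
```

A more lenient parser might skip the block silently instead of raising, but then it would apply the wrong timings to everything after the OFFSET line, which is no better.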

>> I disagree that special casing every new piece of data (inline styles, 
>> URLs to external stylesheets, language tags, "kind" tags) is less 
>> complex than defining the format once so parsers don't have to keep 
>> changing.
> 
> The complex part of adding CSS to WebVTT implementations is not the syntax 
> for adding a new block to the WebVTT parser. That part is trivial.

Not if you don't want to break existing implementations, or invalidate existing files.

> This is the same kind of reasoning that leads to things like XML. It does 
> not lead to simple languages.

You're appealing to emotive arguments, which ill becomes you.

Setting a stage where we can see where and how we can add features in a backwards-compatible way is something you, and the rest of us, do quite often, in many places.  Having general parsing rules is quite common in many standards, including ones you have written.

> The solution is not to try to come up with an uber-meta-language that 
> solves all possible future problems (except all the ones we didn't think 
> of -- e.g. XML totally fails at non-tree data structures). The solution is 
> to just not change the language very often.

No-one is trying to come up with an uber-meta-language, just a way to parse the VTT header.

> It's not a standard. Doesn't have to be. It's vendor-proprietary data that 
> differs from vendor to vendor.

We have an interop question here, not an internal vendor question.

>> It is indeed conceivable that many proprietary name-value tags will be 
>> created in addition to the small set that we have suggested for kind, 
>> language, label, in-band style sheet, and external style sheet. And 
>> indeed the CEA608 document that I've proposed already shows some that 
>> are typically used by caption providers.
> 
> This is the disaster that IMHO we must avert.

Preferably by defining the syntax, and then the names, which is what the rest of us are trying to do.

> The data I'm talking about _isn't consumed_. So there's no parser to 
> write.

Then you're missing the conversation.

> 
>>> Formats that have no general name-value pair syntax, e.g. CSS and 
>>> JavaScript, have not suffered the _slightest_ for it.
>> 
>> But you're wrong: CSS IS a set of name-value pairs - that's the file 
>> format.
> 
> It's not, but even the parts that appear to be name-value pairs aren't 
> arbitrary name-value pairs.

CSS has a general parser for its syntax, just like what we agreed on this list for the VTT header.  The question of what names are allowed or recognized is different.
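
As a rough sketch of that separation, assuming a `Name: value` line syntax for the header (my assumption for illustration, not the agreed text), the parsing step can be written once and the list of recognized names kept apart from it. The function name and the `X-Vendor` example are hypothetical:

```python
def parse_header(text, known=("Kind", "Language", "Label")):
    """Split header lines into name-value pairs without knowing the names.

    Assumption: the header is the run of lines after the WEBVTT signature,
    up to the first blank line, each of the form "Name: value".
    """
    lines = text.splitlines()
    if not lines or not lines[0].startswith("WEBVTT"):
        raise ValueError("missing WEBVTT signature")
    recognized, unrecognized = {}, {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a name-value line; skip it rather than fail
        name = name.strip()
        # Syntax is handled above; which names matter is decided only here.
        target = recognized if name in known else unrecognized
        target[name] = value.strip()
    return recognized, unrecognized
```

With this shape, adding a new name changes only the `known` tuple, and a file carrying a name the parser has never seen still parses.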

> There's no benefit to that. Certainly none that anyone has described as a 
> valid use case for WebVTT.

(IYHO)


David Singer
Multimedia and Software Standards, Apple Inc.

Received on Thursday, 30 August 2012 18:58:35 UTC