Re: Metadata in the VTT file header (bug 15851), use cases (and a need to close this) from Silvia Pfeiffer on 2012-08-30 (public-texttracks@w3.org from August 2012)

From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Date: Thu, 30 Aug 2012 12:49:14 +1000
To: Ian Hickson <ian@hixie.ch>
Cc: David Singer <singer@apple.com>, public-texttracks <public-texttracks@w3.org>
Message-ID: <CAHp8n2k-P_2gnGfc1abqYkEkcGL5YZQbHhn7bFGgjTaXjvxuvA@mail.gmail.com>

On Thu, Aug 30, 2012 at 10:46 AM, Ian Hickson <ian@hixie.ch> wrote:
> On Wed, 29 Aug 2012, David Singer wrote:
>> On Aug 29, 2012, at 16:53 , Ian Hickson <ian@hixie.ch> wrote:
>> > On Wed, 29 Aug 2012, David Singer wrote:
>> >>
>> >> 1) Authoring.  Quite often caption files are authored/written in a
>> >> different workflow from the media, and must be re-united later. We'd
>> >> like to keep track of attributes of the files in-band, so that they
>> >> don't get lost (e.g. the language of the captions), and indeed, of
>> >> the proposed values for the <track> element attributes when the file
>> >> is referenced from HTML. It can also be useful to include a link-back
>> >> to the content that was captioned, using an identifier (e.g. URL).
>> >
>> > This would be entirely addressed by in-file comments, and doesn't need
>> > name-value pairs.
>>
>> Really?  I thought comments were free-form, and HTML5 attributes had a
>> name and a value.  Perhaps you could indicate how software could parse
>> the 'comments' to form the initial/suggested attribute values?
>
> However they want to. It's a one-vendor problem, after all.

Your suggestion will lead to every vendor inventing its own
name-value markup at the start of WebVTT files.
I don't call that a standard.


[..]
>> Can you explain why you want to resist what many of us see as a natural
>> direction to go?
>
> Two reasons.
>
> First, there have really not been any compelling use cases. All the use
> cases presented are either better handled in other ways in WebVTT (e.g.
> how to embed styles, offsets), or are already handled sufficiently by
> WebVTT now or WebVTT with other additions like the block comment syntax
> (e.g. anything involving proprietary workflow additions only needed during
> production). Adding a feature that doesn't have compelling use cases is a
> recipe for disaster.
>
> Second, what we have seen with HTML is that providing arbitrary name-value
> pair syntax that anyone can plug into tends to lead authors down this
> massive rabbit hole of timewasting. People see name-value pair metadata
> syntax and they go crazy adding all kinds of metadata in random syntaxes
> to it, with no common vocabulary, no common processing model, and with
> absolutely no idea what is ever going to consume it. And then: nothing
> consumes it.

That is because the browsers generally don't make use of the
name-value pairs and Web pages are written basically for browsers, not
for anything else.

This is not the case here. Here we deal with an industry that is using
caption and other text track files to display in different players,
many of which are not Web browsers. Files are being embedded into
video files and extracted again, all without a Web browser. All the
information that we need has to be self-contained - we cannot rely on
a Web page providing additional information.

It is indeed conceivable that many proprietary name-value tags will be
created in addition to the small set that we have suggested for kind,
language, label, in-band style sheet, and external style sheet. And
indeed the CEA608 document that I've proposed already shows some that
are typically used by caption providers.

However, putting our heads in the sand and ignoring the problem will
simply lead to everyone developing a different approach to providing
name-value pairs, making it impossible to interoperate with other
tools.

> It is a _huge_ waste of time. Nothing _can_ consume it,
> because the data is of so poor quality (having never been tested) and is
> of so many different formats (there being either no standard or so many
> standards for how to expose it).

On the contrary: it is a huge waste of time to have to write a
different name-value-pair parser for every WebVTT provider.

If we standardize on a structure for how name-value pairs are provided,
at least we can ignore those fields that we don't understand ("we"
being the stand-alone video player developers, and even the Web
developers on the backend that feed their <track> elements from the
data they find in the WebVTT file). And what's more: tools will know
how to transparently process those name-value fields and do something
with them, such as add them to a DB or add them to file headers when
encapsulating.
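
A minimal sketch of that idea, assuming a hypothetical "Name: value"
header syntax (one pair per line after the WEBVTT signature, ended by
the first blank line). The syntax, the parse_header helper, and the
X-Vendor-Job field are illustrative assumptions only, not an agreed
proposal:

```python
# Hypothetical sketch: the "Name: value" header syntax is an assumption,
# not part of any agreed WebVTT specification.
KNOWN_FIELDS = {"kind", "language", "label"}


def parse_header(text):
    """Return (known, unknown) dicts of header name-value pairs."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("WEBVTT"):
        raise ValueError("missing WEBVTT signature")
    known, unknown = {}, {}
    for line in lines[1:]:
        if not line.strip():
            break  # the first blank line ends the header
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a name-value line; skip it
        key = name.strip().lower()
        # Fields we understand go in one dict; everything else is kept
        # separately so a tool can ignore it or pass it through untouched.
        target = known if key in KNOWN_FIELDS else unknown
        target[key] = value.strip()
    return known, unknown


SAMPLE = """WEBVTT
Kind: captions
Language: en
X-Vendor-Job: 1234

00:00:01.000 --> 00:00:02.000
Hello
"""

known, unknown = parse_header(SAMPLE)
print(known)
print(unknown)
```

A player would read only the known fields, while an encapsulation tool
could copy the unknown ones into a DB or a container header without
having to understand them.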

> Formats that have no general name-value
> pair syntax, e.g. CSS and JavaScript, have not suffered the _slightest_
> for it.

But you're wrong: CSS IS a set of name-value pairs - that's the file
format. I don't want to compare a content markup language with a
programming language, so I'm not going to compare WebVTT to JS.


> People still put their proprietary data in those formats (e.g.
> "javadoc"-style documentation in JavaScript), but they do so _when they
> need it_, with testing, with consumers. They include their copyrights in
> comments, and are none the worse for it. You don't get week-long threads
> on forums of people asking what syntax their copyright metadata in CSS
> should be, because the answer is trivial: put it in a comment.

Oh, but even there you will find tools that interpret comments in a
particular way, and if you don't follow their conventions, those
tools will not recognize your copyright.


>> You even proposed a syntax for it, yet you seem to be reaching for
>> reasons not to do it.
>
> Honestly I feel like it is you who is reaching for reasons to do it.
>
> But having said that: we _should_ _always_ be looking for reasons _not_ to
> do something: every time we add a feature to the Web platform, it has
> massive long-term costs. We should be hugely reluctant to do so. It is our
> responsibility as language designers to keep everything out of our
> languages unless the cost is justified by the massive gains. The default
> answer to every proposal should be "no" followed only then by "why?". If
> we can't find a _strong_ justification, we should not include it.

Agreed, in particular for HTML, which is already massive. Saying "no"
has to stop, however, where our doing nothing would simply make us
irrelevant because the world needs that particular feature.
It does in this case - hopefully my arguments above have been able to
show this.

Silvia.
Received on Thursday, 30 August 2012 02:50:02 UTC