Re: Metadata in the VTT file header (bug 15851), use cases (and a need to close this) from Glenn Maynard on 2012-09-02 (public-texttracks@w3.org from September 2012)

From: Glenn Maynard <glenn@zewt.org>
Date: Sun, 2 Sep 2012 12:08:49 -0500
To: Simon Pieters <simonp@opera.com>
Cc: Ian Hickson <ian@hixie.ch>, David Singer <singer@apple.com>, public-texttracks <public-texttracks@w3.org>
Message-ID: <CABirCh9SGXdrr48hDLoo8xUSnEvmas+MKzYD5Yw_DYSV5zpx=A@mail.gmail.com>

On Thu, Aug 30, 2012 at 7:56 PM, Ian Hickson <ian@hixie.ch> wrote:

>  XML wasn't a failure, by a long shot. Nor is JSON. They both have
> situations in which they are useful.
>

You're saying something like "look how ugly and complex XML is because it
tries to be general-purpose, so this is bad too".  There's no comparison to
be drawn; this is nothing like XML.

> For every piece of data, sure. Nobody is suggesting that, as far as I'm
> aware.
>

Language?  Parser changes.  Kind?  Parser changes.  Style?  Parser
changes.  I'm sorry if I see a pattern...

> > > I'm not recommending any more than anyone else, I'm just saying we
> > > shouldn't try to second-guess their needs and offer a specific place
> > > for them other than just comments.
> >
> > This does nothing more than store strings; it makes no attempt at
> > arbitrary data types, high-level structure or "second-guessing".
>
> Guessing that you only need to store strings is still a guess. Currently
> there's only one piece of information proposed for WebVTT that makes sense
> to have in VTT that fits in the form of a simple one-line string, the
> language (and it's really a complicated structured data format itself, not
> really freeform string).


Again, "kind".  The structure of the string isn't relevant at the parsing
level (any more than the structure of specific HTML attributes is relevant
to the HTML parser).

> There have been a number of other proposals for
> things that might go here, and while I disagree that they make sense in
> VTT, even amongst those not everything is a string -- e.g. "default" is
> really a boolean, not a string, and so we'd have to add conventions on top
> of the format beyond the string value to determine what it means (since
> presumably what matters there is presence/absence, not the value, and the
> empty string would be positive, not negative), if we even tried to use a
> string form to store it, which IMHO is a bad idea (as I think HTML
> attributes have shown for the many boolean attributes there).
>

Of course we'd have to define the meaning of the value, just as the meaning
of particular HTML attributes (and at the conformance level, the list of
valid attributes) needs to be defined.  This only defines the
encapsulation.  (For booleans, I'd be inclined to say "Flags: default",
where the value is a space-separated list of things to enable.)
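
As a rough sketch of the encapsulation described above: headers are plain
name/string pairs, and the meaning of each value is layered on top.  The
"Name: value" syntax, the function names, and the "Flags" header are
illustrative assumptions here, not anything the parser currently defines.

```python
def parse_headers(lines):
    """Parse 'Name: value' lines into a dict of strings (hypothetical syntax)."""
    headers = {}
    for line in lines:
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a name-value line; simply ignored in this sketch
        headers[name.strip()] = value.strip()
    return headers


def flags(headers):
    """Interpret the hypothetical 'Flags' header as a set of enabled flags."""
    return set(headers.get("Flags", "").split())


headers = parse_headers(["Language: en", "Kind: captions", "Flags: default"])
is_default = "default" in flags(headers)  # presence in the list enables the flag
```

The parsing step only ever sees strings; whether "default" is a boolean is
decided by whatever interprets the "Flags" value afterwards.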

> > I'm glad that's not what we did, then.  We saw that a set of use cases
> > all have the same structure--an identifier and a string--and designed a
> > clean solution that fits them all well.
>
> What use case has identifiers??? I'm not aware of any! There are some that
> have strings (language and styling comes to mind), but the use cases don't
> have identifiers, you'd have to add one to make it make sense to use in
> name-value pairs.
>

Unless you have an opaque list of unlabelled "en\ncaptions\n.cls { color:
red; }\n" data, you're always going to have an identifier somewhere
("Language: en") to know what you're looking at.

> > Of course there have.  "Language" and "kind".
>
> It's not at all clear to me that that information should be inline. That
> will just lead to inconsistent data, as we've seen with e.g. Content-Type
> headers and character encoding labels. As I've said before, for the case
> of "muxing", i.e. where there's data in the file for the purpose of the
> editing workflow, it's not clear that you'd ever want that data to survive
> outside the editing workflow, and so I don't see why we need to define
> anything here.


We've explained in several ways why.  A "workflow" isn't a single suite of
applications, or a single author.  If my editing software doesn't output
the Language and Kind values in a way Joe's muxing software understands,
Joe gets a headache.

> If the data is proprietary, the syntax doesn't need to be standard.
>

We're not talking about proprietary data, but that aside--*most* JSON data
is proprietary, and benefits significantly from a standard syntax.

> Blank lines are not meaningful in CSS, so you can just strip them.
>

Newlines aren't meaningful, either, so you may as well argue that inline
stylesheets should be stored as a single line.  Please don't uglify my data.

> Within the editing workflow, before publication, editors can use whatever
> format they want. WebVTT does not pretend to be an editing workflow
> format, nor would it be anything close to a good choice for such a format.
>

It's just fine for many workflows, just like SRT and SSA.  The actual input
formats you hand off to MKV/WebM muxers (e.g. mkvmerge) are precisely these
formats.



On Sun, Sep 2, 2012 at 2:42 AM, Simon Pieters <simonp@opera.com> wrote:

> The main reason I'm against escaping is that CSS already has an escaping
> mechanism, and we all know how confusing it gets when having to deal with
> multiple layers of escaping. The proposed escaping is even inconsistent in
> that the backslash escaping is only applied if it's the first character of
> the line rather than all characters, which makes it even more confusing.


I don't see the inconsistency.  "If the first character is a backslash,
remove it."  That's simple and consistent.
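
A minimal sketch of that rule, taking only the sentence above as the
definition (the function name is made up for illustration, and the rest of
the proposed escaping isn't modelled):

```python
def unescape_line(line):
    """If the first character is a backslash, remove it; otherwise leave the line alone."""
    return line[1:] if line.startswith("\\") else line
```

Only one leading backslash is ever removed, and backslashes anywhere else in
the line (including CSS escapes) pass through untouched.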


On Sun, Sep 2, 2012 at 2:51 AM, Simon Pieters <simonp@opera.com> wrote:

> Well, if we're reasoning about software that allows editing an arbitrary style
> sheet and hides the underlying syntax from the user, the editor can
> serialize the style sheet as a data: URL and put it in an @import.
>

Text formats should be human-readable, even when created by a machine;
encoding data in base64 is essentially making it a binary file.
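
To illustrate the readability point, here is a small sketch of what a
base64 data: URL does to an inline stylesheet.  It assumes the editor picks
base64 (a data: URL could also be percent-encoded), and the sample rule and
variable names are only for illustration.

```python
import base64

css = ".cls { color: red; }"

# Serialize the stylesheet as a base64 data: URL, as an editor might.
encoded = base64.b64encode(css.encode("utf-8")).decode("ascii")
data_url = "data:text/css;base64," + encoded

# What would end up in the file instead of the readable rule.
import_rule = '@import url("' + data_url + '");'

# Recovering the original text takes an explicit decode step.
decoded = base64.b64decode(encoded).decode("utf-8")
```

The selector and property names no longer appear anywhere in the serialized
form, so you can't read or hand-edit the styles in a text editor.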

>> It's not 100% compatible with the
>> current parser,
>>
>
> Why not?


Because changing step 14 away from "any --> in the header loop breaks out
of headers" to something more specific ("-->" not within a header value)
will cause some files to be parsed differently.  I don't mind that, if it's
not too late.  The proposed period-based format was designed with the
expectation that this would be a "v2" ("later") feature, at which point it
would definitely be too late.
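
A simplified sketch of the difference, not the actual parser algorithm: the
first function ends the header at any line containing "-->", and the second
ends it only if the "-->" isn't inside a header value.  The "Name: value"
header pattern and the function names are assumptions for illustration.

```python
import re

# Hypothetical header syntax: a name starting with a letter, then a colon.
HEADER_RE = re.compile(r"^[A-Za-z][A-Za-z0-9-]*:")


def header_end_current(lines):
    """Index where the header ends if any line containing '-->' breaks out."""
    for i, line in enumerate(lines):
        if line == "" or "-->" in line:
            return i
    return len(lines)


def header_end_proposed(lines):
    """Index where the header ends if '-->' inside a header value is allowed."""
    for i, line in enumerate(lines):
        if line == "":
            return i
        if "-->" in line and not HEADER_RE.match(line):
            return i
    return len(lines)


lines = [
    "Language: en",
    "Note: arrows look like -->",
    "Kind: captions",
    "",
    "00:00.000 --> 00:01.000",
    "Hi",
]
old_end = header_end_current(lines)   # stops at the "Note:" line
new_end = header_end_proposed(lines)  # stops at the blank line
```

Any file with "-->" in what would become a header value is one of the files
that would be parsed differently after such a change.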

On Sun, Sep 2, 2012 at 4:31 AM, Silvia Pfeiffer
<silviapfeiffer1@gmail.com> wrote:

> E.g. say you're parsing a WebVTT file according to its structure to
> encapsulate
> them into WebM, then you would end up identifying the header until the
> first empty line, then identifying the cues. And as you identify a cue
> that you cannot give a time segment to (because there is none), you
> drop the cue on the floor. This means that a WebM encapsulation would
> always drop an inline style sheet.
>

I'm not quite following.  Dropping stylesheets is currently correct
(because they don't exist yet), and once the parser defines how to parse
them (whether as a header format or as a special case), it'll stop dropping
them and the WebM muxer can do whatever it wants with the data.

Whether the data is interpreted by the current parser as a bad cue, or
parsed by the placeholder "header" loop, either way the parser just
discards the data.

On Sun, Sep 2, 2012 at 6:58 AM, Silvia Pfeiffer
<silviapfeiffer1@gmail.com> wrote:

> No, it wouldn't. But once we have a mechanism to identify the end of
> the header, if we store any header information in cues, then
> everything that makes a clean separation between headers and cues
> (such as a WebM encoder) will be able to deal with the headers, but
> would drop the cues on the floor. WebM has already made precautions
> for headers and anything that's between the WEBVTT marker and the
> first cue (not the first successfully parsed cue!!) is regarded as the
> header, see
> http://wiki.webmproject.org/webm-metadata/temporal-metadata/webvtt-in-webm
> :
>
> "all the text (up to and excluding the linefeed separator that
> demarcates the file-wide metadata and the first cue) could be stored
> in the CodecPrivate sub-element of the Track element"
>

Then their definition just changes a little, to include all header data
according to the WebVTT parser.  It shouldn't have any effect on the file
format if they're just storing it as a block of text.  It would require a
muxer change, for the modified parser.  This is no different than the
effects such a change would have on any other parser, such as in browsers.

Whether it's "too late" (or whether reaching a resolution on this will
take so long that it becomes too late) is a separate question.  (When we were
devising the "."-based proposal, we were assuming it would be.  I think
that's still a safe assumption, but I'm happy if that's not the case.)  I
have no idea whether any WebM muxers actually parse WebVTT yet, but I
doubt they're in use anywhere in the wild.

-- 
Glenn Maynard
Received on Sunday, 2 September 2012 17:09:23 UTC