Re: metadata in the VTT file header, re-starting the conversation from Glenn Maynard on 2012-02-24 (public-texttracks@w3.org from February 2012)

From: Glenn Maynard <glenn@zewt.org>
Date: Thu, 23 Feb 2012 18:23:45 -0600
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
Cc: Philip Jägenstedt <philipj@opera.com>, public-texttracks@w3.org
Message-ID: <CABirCh_YoL2RRE29f2V0xkDvrmy=0DV-uUWLhy=iyGtsmfWD9w@mail.gmail.com>

Three different usage scenarios are:

1: .VTT tracks defined in HTML.
2: .VTT tracks embedded in a container like WebM.
3: Loose .VTT tracks, in a directory alongside a video.

I don't think the types of metadata you're describing (mirroring <track>)
are necessarily important for #2, since WebM, etc. should define a way to
embed that on its own (as you mention they're working on).  If the
information has to be loaded out of each .VTT file, it could require a lot
of seeking around the file to load it; slow on optical media, even if it
happens to be stored in the same file.

Mirroring that information only seems important for #3.  That case is
uncommon, but it does happen.  I can't decide whether such an infrequent
use case is worth the problem I mention below...

I suppose it might also be convenient for authoring, e.g. so extracting a
.VTT from a WebM file can include the metadata inline instead of having to
somehow output an HTML stub, and so WebM muxing tools don't have to be able
to parse HTML to read the metadata to be stored in the output file.

On Thu, Feb 23, 2012 at 5:07 AM, Silvia Pfeiffer
<silviapfeiffer1@gmail.com> wrote:

> They have to react differently to the data in the cues depending on
> whether it is a caption/subtitle, a description, a chapter or a
> metadata file. So this information is vital to have.
>
> Also, the information as to what language the track is in would be
> very important to display in the list of available caption tracks. For
> example, VLC currently loads all the SRT tracks for a video that are
> in the same directory, but only displays them as "track1", "track2",
> "track3", etc. which is pretty useless from a UI POV. Instead, if
> there was a normative location to describe the language, VLC could
> display that language.
>
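
To make the loose-file case concrete, here is a minimal Python sketch of how
a player might label a track from an in-file language header.  The
"Language: xx" header line is purely an assumption for illustration (no such
header syntax is defined in this thread), and track_label is a hypothetical
helper name.

```python
def track_label(filename, vtt_text):
    """Return a display label for a loose .vtt file.

    Assumes a hypothetical "Language: xx" line in the header block that
    follows the WEBVTT line; falls back to the bare filename otherwise.
    """
    # The header block is everything before the first blank line.
    header_block = vtt_text.replace("\r\n", "\n").split("\n\n", 1)[0]
    for line in header_block.splitlines()[1:]:  # skip the WEBVTT line
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "language" and value.strip():
            return f"{value.strip()} ({filename})"
    return filename
```

A file with no such header still loads; it just keeps a filename-only label,
much like the "track1", "track2" listing described above.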

My biggest concern is that metadata in <video> is guaranteed to be out of
sync with metadata in the .VTT header in many files, and many people won't
set it at all.  They'll never notice a problem, since it'll work fine for
them in browsers, which will use the <track> information.

I'm nervous about introducing data redundancy that we know for sure will
lead to inconsistencies...
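
As a rough sketch of what a checking tool would have to do, assuming a
hypothetical header made of "Key: value" lines directly after the WEBVTT
line (the syntax, the key names, and both function names are assumptions,
not anything specified here):

```python
def parse_header_metadata(vtt_text):
    """Collect hypothetical 'Key: value' lines from the .vtt header block."""
    lines = vtt_text.splitlines()
    if not lines or not lines[0].lstrip("\ufeff").startswith("WEBVTT"):
        raise ValueError("not a WebVTT file")
    metadata = {}
    for line in lines[1:]:
        # Stop at the blank line ending the header, or at a cue timing line.
        if not line.strip() or "-->" in line:
            break
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip().lower()] = value.strip()
    return metadata


def find_mismatches(track_attrs, header_metadata):
    """Return {key: (track_value, header_value)} where the two disagree.

    track_attrs is assumed to already use the same key names as the header
    (e.g. "kind", "language").  Keys missing from the header are ignored,
    since many files will not set them at all.
    """
    mismatches = {}
    for key, track_value in track_attrs.items():
        header_value = header_metadata.get(key)
        if header_value is not None and header_value != track_value:
            mismatches[key] = (track_value, header_value)
    return mismatches
```

A browser would never run a check like this, since it only reads the <track>
attributes, which is why a stale header can go unnoticed.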

Maybe if WebM muxers/demuxers and other tools depend on these headers
(instead of reading HTML <video> snippets or something similarly annoying),
it'll help encourage people to use it properly, but it still seems like a
losing battle.

Of course, I'm only talking about the data overlap between this sort of
metadata and <track>.  This isn't a problem with metadata that doesn't
overlap with <track>, such as default cue properties.

> > I'll have to read up on the WebM metadata thread soon, because I don't see
> > why it would be dependent on the format WebVTT uses.
>
> It's here:
> http://wiki.webmproject.org/webm-metadata/temporal-metadata/webvtt-in-webm
>

This doesn't mention how to deal with external CSS files and fonts.  I
don't know if that's implicitly defined by existing WebM mechanisms or just
something they haven't figured out yet.

It also says: "This is how roll-up captions work: multiple cues are
rendered simultaneously, and when the top cue expires, the other cues move
up and a new cue appears at the bottom."  I don't know why it says that,
since WebVTT doesn't do roll-up captions.

(I don't have the bandwidth to join WebM lists to ask about these things,
so I'd just ask anyone involved in those discussions who thinks any of this
is worth mentioning to do so.)

> metadata is stored in CodecPrivate etc.

(It doesn't look like that's what it's currently suggesting, FYI: "no
WebVTT data is stored in the CodecPrivate element of the WebM Track
header".  It's a wiki, so maybe it changed since you read it last.)

-- 
Glenn Maynard

Received on Friday, 24 February 2012 00:24:13 UTC