- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Thu, 10 May 2012 10:27:49 +1000
- To: Glenn Maynard <glenn@zewt.org>
- Cc: Philip Jägenstedt <philipj@opera.com>, public-texttracks@w3.org
(Getting back to this discussion that I had lost track of ..)

On Fri, Feb 24, 2012 at 11:23 AM, Glenn Maynard <glenn@zewt.org> wrote:
> Three different usage scenarios are:
>
> 1: .VTT tracks defined in HTML.
> 2: .VTT tracks embedded in a container like WebM.
> 3: Loose .VTT tracks, in a directory alongside a video.

Agreed.

> I don't think the types of metadata you're describing (mirroring <track>)
> are necessarily important for #2, since WebM, etc. should define a way to
> embed that on its own (as you mention they're working on).

They are, but they are looking for use case #3 to provide what needs to be embedded into the video. That is indeed the most common way that data is presented to a muxing program: the video file plus the individual tracks with all the information that is required for the encapsulation. Sometimes you can overwrite the information given in the track with command-line parameters. But never have I heard of a muxing program that reads an HTML file to get its metadata.

> If the
> information has to be loaded out of each .VTT file, it could require a lot
> of seeking around the file to load it; slow on optical media, even if it
> happens to be stored in the same file.

We're only talking about header-style metadata. There is no seeking around required: it comes straight after the WEBVTT magic string.

> Mirroring that information only seems important for #3. That case is
> uncommon, but it does happen. I can't decide if the problem I mention below
> is worth the relative infrequency of this use case...

I don't know where you get your statistics, but almost all usage of SRT files on the desktop works the #3 way, and all of it suffers from the metadata problem, which is something we don't want to repeat with WebVTT. The rest of the desktop use cases (in particular MPEG-4 and QuickTime files) have it muxed in-band, i.e. the #2 case. We're introducing #1 because it's the Web way, but it's a new way and by far not the most common way yet.

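As a rough illustration of the header-style metadata discussed above, here is a minimal Python sketch of how a tool could read it without seeking: it reads only the lines between the WEBVTT magic string and the first blank line. The "Key: value" line syntax, the key names "Kind" and "Language", and the function name read_vtt_header are assumptions made for this example; the thread does not fix a header syntax.

```python
def read_vtt_header(text):
    """Return header metadata found between the WEBVTT line and the first blank line.

    Assumes hypothetical "Key: value" header lines; this syntax is an
    illustration, not something defined in the thread.
    """
    lines = text.lstrip("\ufeff").splitlines()  # tolerate a leading BOM
    if not lines or not lines[0].startswith("WEBVTT"):
        raise ValueError("not a WebVTT file: missing WEBVTT magic string")

    metadata = {}
    for line in lines[1:]:
        # A blank line ends the header; a cue timing line means there is no header left.
        if not line.strip() or "-->" in line:
            break
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip().lower()] = value.strip()
    return metadata


sample = """WEBVTT
Kind: captions
Language: en

00:00:01.000 --> 00:00:04.000
Hello, world.
"""

print(read_vtt_header(sample))
```

Because the loop stops at the first blank line, a player or muxer only has to read the top of each file to label a track, however long the cue list is.
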
> I suppose it might also be convenient for authoring, eg. so extracting a
> .VTT from a WebM file can include the metadata inline instead of having to
> somehow output an HTML stub, and so WebM muxing tools don't have to be able
> to parse HTML to read the metadata to be stored in the output file.

Agreed.

> On Thu, Feb 23, 2012 at 5:07 AM, Silvia Pfeiffer <silviapfeiffer1@gmail.com>
> wrote:
>>
>> They have to react differently to the data in the cues depending on
>> whether it is a caption/subtitle, a description, a chapter or a
>> metadata file. So this information is vital to have.
>>
>> Also, the information as to what language the track is in would be
>> very important to display in the list of available caption tracks. For
>> example, VLC currently loads all the SRT tracks for a video that are
>> in the same directory, but only displays them as "track1", "track2",
>> "track3", etc. which is pretty useless from a UI POV. Instead, if
>> there was a normative location to describe the language, VLC could
>> display that language.
>
>
> My biggest concern is that metadata in <video> is guaranteed to be out of
> sync with metadata in the .VTT header in many files, and many people won't
> set it at all. They'll never notice a problem, since it'll work fine for
> them in browsers, which will use the <track> information.
>
> I'm nervous about introducing data redundancy that we know for sure will
> lead to inconsistencies...

I don't regard that as a problem, but as an opportunity. The file itself has one set of metadata. That's data that the Web Dev can decide to use. Or instead they can decide to overrule it with specific directions in the <track> element. For example, if I had a Web Server that is managing a collection of WebVTT files, I would most likely have the WebVTT files managed and created by somebody who has nothing to do with the Website.
I'd either make sure the files I am given have the right metadata inside them, or if I don't trust the files I'd ignore the metadata. I would most likely create the attributes of a <track> element by analysing the content of the WebVTT files that I am serving and just hand that data through. In this way the browser gets all the information that it needs out of the WebVTT file without actually having to download and parse anything from the WebVTT file. It's proxied information, not redundant information.

> Maybe if WebM muxers/demuxers and other tools depend on these headers
> (instead of reading HTML <video> snippets or something similarly annoying),
> it'll help encourage people to use it properly, but it still seems like a
> losing battle.

That's like saying you can't trust any information given to you in files. In the end, you have to be able to rely on some data: either you rely on the Web dev doing the correct thing or you rely on the WebVTT author doing the right thing. Who can you rely on more? If done properly, the Web dev will just use what's in the file, and the WebVTT author will be the one making sure the file is correct.

>> > I'll have to read up on the WebM metadata thread soon, because I don't
>> > see
>> > why it would be dependent on the format WebVTT uses.
>>
>> It's here:
>> http://wiki.webmproject.org/webm-metadata/temporal-metadata/webvtt-in-webm
>
>
> This doesn't mention how to deal with external CSS files and fonts. I don't
> know if that's implicitly defined by existing WebM mechanisms or just
> something they haven't figured out yet.

We haven't figured out how to deal with external CSS and WebVTT for non-browser apps either. The WebM mechanism will simply rely on whatever we come up with. If it's independent files that have to be delivered with the media and the WebVTT file (maybe in a zip file), then that works for WebM.
I'm wary of putting a file name into WebVTT - I'd much rather leave it informal, with the files delivered together in a zip file under the same names. Inline CSS in WebVTT headers would also work for WebM.

> It also says: "This is how roll-up captions work: multiple cues are rendered
> simultaneously, and when the top cue expires, the other cues move up and a
> new cue appears at the bottom." I don't know why it says that, since WebVTT
> doesn't do roll-up captions.

Such a shame, isn't it! Just look at: http://www.youtube.com/watch?v=oxkZTF-7Lgw - how will we do that with WebVTT?

> (I don't have the bandwidth to join WebM lists to ask about these things, so
> I'd just ask anyone involved in those discussions who thinks any of this is
> worth mentioning to do so.)

No worries. I can be the proxy.

>> metadata is stored in CodecPrivate etc.
>
> (It doesn't look like that's what it's currently suggesting, FYI: "no WebVTT
> data is stored in the CodecPrivate element of the WebM Track header". It's
> a wiki, so maybe it changed since you read it last.)

You're misreading. This refers to storing no payload data (i.e. no cues) in the CodecPrivate header. Further down it says: "File-wide metadata does not have a timestamp, so all the text (up to and excluding the linefeed separator that demarcates the file-wide metadata and the first cue) could be stored in the CodecPrivate sub-element of the Track element."

Regards,
Silvia.

Received on Thursday, 10 May 2012 00:28:40 UTC