Re: Metadata in the VTT file header (bug 15851), use cases (and a need to close this)

On Aug 29, 2012, at 16:53 , Ian Hickson <ian@hixie.ch> wrote:

> On Wed, 29 Aug 2012, David Singer wrote:
>> 
>> 1) Authoring.  Quite often caption files are authored/written in a 
>> different workflow from the media, and must be re-united later. We'd 
>> like to keep track of attributes of the files in-band, so that they 
>> don't get lost (e.g. the language of the captions), and indeed, of the 
>> proposed values for the <track> element attributes when the file is 
>> referenced from HTML. It can also be useful to include a link-back to 
>> the content that was captioned, using an identifier (e.g. URL).
> 
> This would be entirely addressed by in-file comments, and doesn't need 
> name-value pairs.

Really?  I thought comments were free-form, and HTML5 attributes had a name and a value.  Perhaps you could indicate how software could parse the 'comments' to form the initial/suggested attribute values?
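To make the question concrete, here is a minimal sketch of what a tool could do with name-value header lines. The "Name: value" header syntax, the field names (Kind, Language, Source), and both function names are assumptions made for illustration; none of them is taken from the WebVTT draft.

```python
def parse_header_metadata(vtt_text):
    """Collect 'Name: value' lines from the header block of a WebVTT file.

    Assumes (hypothetically) that metadata lines sit between the WEBVTT
    signature line and the first blank line.
    """
    lines = vtt_text.splitlines()
    if not lines or not lines[0].lstrip("\ufeff").startswith("WEBVTT"):
        raise ValueError("not a WebVTT file")
    metadata = {}
    for line in lines[1:]:
        if not line.strip() or "-->" in line:
            break  # a blank line or a cue timing line ends the header
        name, sep, value = line.partition(":")
        if sep and name.strip():
            metadata[name.strip().lower()] = value.strip()
    return metadata


def suggested_track_attributes(metadata):
    """Map hypothetical header names onto <track> element attributes."""
    mapping = {"kind": "kind", "language": "srclang", "label": "label"}
    return {attr: metadata[key] for key, attr in mapping.items() if key in metadata}


SAMPLE = (
    "WEBVTT\n"
    "Kind: captions\n"
    "Language: en\n"
    "Source: http://example.com/movie.mp4\n"
    "\n"
    "00:00:01.000 --> 00:00:04.000\n"
    "Hello.\n"
)

print(suggested_track_attributes(parse_header_metadata(SAMPLE)))
```

A free-form comment gives a tool nothing comparable to key on unless a name-value convention is layered on top of it, which is the same thing by another route.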

> In fact name-value pairs wouldn't address the problem 
> sufficiently, since some people have data that isn't name-value pairs 
> (e.g. an author might want to include name,language,value tuples, or 
> binary data, or structured data). In addition, author-specific workflow 
> data doesn't need to follow a standard, since it only needs to be 
> interoperable within the application the user uses.

Caption houses are often separate from authoring houses, which are in turn separate from the content distribution house;  this is what interoperability is about. :-)  Indeed, caption houses often have the rights to database captions but not the content, and content houses the opposite (I know, it seems bizarre that Disney doesn't retain the rights to captions they paid to have made, but it's often the case).

> So this is not a use case for a name-value pair metadata header.

I disagree.

> Comment blocks are the topic of:
>   https://www.w3.org/Bugs/Public/show_bug.cgi?id=14552

Comments are fine too, but a separate discussion.

>> 2) Use in other embeddings.  MPEG has started work on specifying MP4 
>> carriage of WebVTT in a track of the MP4 file. In this context, we need 
>> some of the attributes that are carried in the HTML layer.  Some are 
>> already covered or partially covered (e.g. all tracks can carry a 
>> language in MP4) but not all.  WebM embedding is also under way.
> 
> This should be at the container level, not in VTT, IMHO. It is trivial for 
> a container format to define how to include such information; even in the 
> case of a format that can only embed data directly, the payload format can 
> always be defined as being the data from the <track> element followed by 
> the WebVTT data itself.
> 
> So this is not a use case for a name-value pair metadata header within 
> WebVTT itself either.

It's true that most containers have some provision for some of this.  However, material that is specific to VTT is best placed within it, IMHO.

>> 3) Side-band use in other contexts. In some delivery scenarios, it makes 
>> sense for WebVTT caption files not to be embedded but carried in a 
>> 'side-band' (e.g. in HTTP streaming systems), that is, loaded as a 
>> side-file. In this case, we need the ability to carry attributes that 
>> the referencing file does not carry.
> 
> Can you elaborate on this use case? What attributes? Why?

The attributes that the VTT file would have had if it had been embedded in HTML, for a start:  language and kind, to name two explicitly.

>> 4) Style-sheets.  Maybe it's satisfactory to define that WebVTT inherits 
>> styling from its container (e.g. HTML5), but in the case where the 
>> container doesn't carry styling (e.g. HTTP streaming, MP4), or in the 
>> case where specific styling is needed for the WebVTT, we need to be able 
>> to reference or include style sheets in the WebVTT layer itself. As an 
>> example, a style-sheet giving 608/708 appearance is being worked on as 
>> part of the 608/708 conversion.
> 
> This is handled by the proposal(s) in:
>   https://www.w3.org/Bugs/Public/show_bug.cgi?id=15023
> 
> This is not a use case for a name-value pair metadata header.

Except that in that very bug, Glenn helpfully formats the example into exactly this general syntax.

>> 
>> 5) Time alignment. When WebVTT is used as the caption source for a 
>> system where timestamps are from an arbitrary origin (e.g. a continuous 
>> MPEG-2 Transport stream) we need a way to say that 'timestamp X in this 
>> VTT file aligns with Timestamp Y in the media stream' so as to get 
>> synchronization.  This is naturally put into the header.
> 
> If there's a WebVTT file with fixed timestamps and a media stream with 
> arbitrary timestamps, then the only place where it makes sense to put the 
> synchronisation information is in the media stream. Putting it in the 
> WebVTT stream makes no sense; if you are able to adjust that stream then 
> why not just adjust the timestamps?

Pardon?  You're suggesting completely re-writing the timestamps in the MPEG-2 transport stream so as to … do exactly what?  What we need is a mapping, not a re-write of whole streams.

> Or even better, don't have arbitrary time stamps, and have both the media 
> stream and the captions use the same timeline.

Great, are you ready for a new timestamp syntax in VTT, and a need to re-write the entire file, when all that is needed is a mapping?  Also, what about the case (not unusual) when a pre-authored piece of content is used as part of a broadcast?  In the broadcast, the media will have a continuous timestamp flow that, effectively, has an arbitrary origin.  When the pre-authored piece is transmitted, we want to align the pre-authored caption file with its media.  That is much more natural than re-writing the entire caption file, or (dream on) re-writing the entire transport stream.
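As a rough sketch of the mapping I mean: a single anchor pair ("cue time X aligns with media timestamp Y") is enough to place every cue on the stream's timeline without touching either file. The function names and the idea of carrying the anchor as two values are assumptions for illustration only; the 90 kHz clock and 33-bit wrap are the usual MPEG-2 presentation timestamp properties.

```python
PTS_HZ = 90_000      # MPEG-2 presentation timestamps tick at 90 kHz
PTS_WRAP = 1 << 33   # and are 33 bits wide, so they wrap around


def parse_vtt_timestamp(text):
    """Convert 'hh:mm:ss.ttt' or 'mm:ss.ttt' to seconds."""
    parts = text.split(":")
    if len(parts) == 2:
        parts = ["0"] + parts  # hours are optional in WebVTT timestamps
    hours, minutes, seconds = parts
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def cue_time_to_pts(cue_time, anchor_cue_time, anchor_pts):
    """Map a cue time (seconds) to a stream PTS using one anchor pair.

    anchor_cue_time and anchor_pts are the hypothetical header values:
    'timestamp X in this VTT file aligns with timestamp Y in the stream'.
    """
    delta_ticks = round((cue_time - anchor_cue_time) * PTS_HZ)
    return (anchor_pts + delta_ticks) % PTS_WRAP


# A pre-authored clip whose time zero lands at PTS 1,000,000 in the broadcast.
start = cue_time_to_pts(parse_vtt_timestamp("00:00:10.000"), 0.0, 1_000_000)
print(start)
```

The caption file and the transport stream both stay byte-for-byte as authored; only the two anchor values change when the clip is re-broadcast at a different point in the stream.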

> Thus this is not a use case for a name-value pair metadata header in VTT.

Again, I disagree.

Can you explain why you want to resist what many of us see as a natural direction to go?  You even proposed a syntax for it, yet you seem to be reaching for reasons not to do it.

David Singer
Multimedia and Software Standards, Apple Inc.

Received on Thursday, 30 August 2012 00:13:32 UTC