RE: Timed tracks

"people making high-end integrated solutions" is an ad hominem
argument. I don't understand your argument for simplicity.

In what way is sticking the captioning or subtitle or
timed text information in the HTML any simpler, from anyone's
point of view? Is it simpler for authors? Simpler for people
who want to embed a video into a web page, and have to copy
not only the link to the video but also the timed tracks
information? Simpler for browser implementors?

When I think about workflows for producing, editing,
linking, copying, referencing, and otherwise using 
video on the web, associating timed text directly with the video
rather than embedding it in the HTML seems like it is
"simpler" for the end users.

And from an implementation point of view, if you're going to
implement some way of viewing/accessing/retrieving timed tracks
directly associated with the video without embedding the
timed tracks in the HTML itself, well, having to support
it directly embedded in the HTML itself just adds to the
implementation complexity.

What if the video stream is long? You'd bloat the
HTML with the timed text, which would interfere with the
ability to load the page quickly, even if the video is
not auto-play.
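
A rough sketch of the size concern, written in Python. It assumes the
hypothetical <caption frametime=... string=...> markup from the quoted
message below; the function names, the two-hour duration, the
4-second caption interval and the 40-character caption text are all
illustrative assumptions, not measurements.

```python
def caption_markup(seconds: float, text: str) -> str:
    """Render one inline caption line in the hypothetical markup."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    stamp = f"{int(hours):02d}:{int(minutes):02d}:{secs:05.2f}"
    return f"<caption frametime='{stamp}' string='{text}'>\n"


def inline_caption_bytes(duration_s: int, interval_s: int, text: str) -> int:
    """Total UTF-8 bytes of caption markup added to the page."""
    return sum(
        len(caption_markup(t, text).encode("utf-8"))
        for t in range(0, duration_s, interval_s)
    )


if __name__ == "__main__":
    # Assumed scenario: 2-hour video, one 40-character caption every 4 s.
    total = inline_caption_bytes(2 * 3600, 4, "x" * 40)
    print(f"{total} bytes of caption markup before any page content")
```

Under those assumptions the page carries 1,800 caption elements that
must be downloaded and parsed whether or not the video is ever played.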

What if the video stream is actually dynamically generated,
and the "timed text" not available?


-----Original Message-----
From: Joe D Williams [] 
Sent: Sunday, May 16, 2010 4:27 PM
To: Larry Masinter; 'Jonas Sicking'
Cc: 'Henri Sivonen'; 'Julian Reschke'; 'Ian Hickson'; 'Philippe Le
Hegaret'; 'Edward O'Connor';; 'Anne van Kesteren'
Subject: Re: Timed tracks

> The fact that HTML is used in a wide variety of contexts ...

Right, this timed track is not a big deal. Well, how to expose the
info to the DOM may be. Sure, the video format people should decide
upon a form that all dedicated highest-performance, most-features,
branded web/home/broadcast/hobby/professional video players can play,
but this is more like the basic question of which basic controls
(play, stop, etc.) should be exposed in HTML5 user code. Why not opt
for the simplest possible markup:
<video ...>
<track whatever>
 <caption frametime='00:00:00.00' string='Start of Show'>
  [one caption for each time]

I haven't seen all the examples so maybe there are better names, but
the main idea is to expose info to the host web browser by placing
some content in the DOM so the 'native' video player of the host html5
browser, or optionally the 'native' video player itself, or optionally
some accessibility tools, can deal with it.

More detailed expressions are natural because the majority of the
interest and effort for making this a 'complete' standards-track
solution for adding captions/interactivity to video comes from the
people making high-end integrated solutions where all this stuff is
metasemantics included in the video file, and the video player that
processes this stuff is more like a high performance 'plugin' as embed
or object content than the vision of a competent but understood to be
limited 'native' video player that is able to play some minimum
formats, which may optionally include captioning/interactivity defined
by some content in the html5 user code.

So, for html5, keep this real simple to implement and access via the 
DOM. Let those folks making big time commercial plugins deal with the 
big problems and history far future and all that and just give me 
something that I can use that a simple html5 <video> element can 

Thanks to All and Best Regards,

----- Original Message ----- 
From: "Larry Masinter" <>
To: "'Jonas Sicking'" <>
Cc: "'Henri Sivonen'" <>; "'Julian Reschke'" <>; "'Ian Hickson'" <>;
"'Philippe Le Hegaret'" <>; "'Edward O'Connor'" <>; <>;
"'Anne van Kesteren'" <>
Sent: Wednesday, May 12, 2010 10:20 PM
Subject: RE: Timed tracks

Of course, because a specific kind of device or component
or agent is a source of legitimate use cases does not
imply that every aspect of the operation of the device
is in scope; otherwise we might be talking about voltage
regulators and the electrical properties of HDMI interfaces.

The fact that HTML is used in a wide variety of contexts is
a strong argument for modularity and separation of concerns,
not for leaving topics that *can* be orthogonal in scope.

And I think that's an issue that was resolved back in


-----Original Message-----
From: Jonas Sicking []
Sent: Wednesday, May 12, 2010 11:38 AM
To: Larry Masinter
Cc: Henri Sivonen; Julian Reschke; Ian Hickson; Philippe Le Hegaret;
Edward O'Connor;; Anne van Kesteren
Subject: Re: Timed tracks

On Wed, May 12, 2010 at 11:30 AM, Larry Masinter <> wrote:
> # To me, your question seems totally irrelevant to this WG.
> # If the $99.99 device contains a Web browser, then yes.
> # If it doesn't contain a Web browser, the capabilities of
> # the device are not relevant to <video>.
> The working group is chartered to work on a definition of the
> Hypertext Markup Language and its related APIs, not on the
> definition of a "Web browser".
> A device which can parse conforming HTML, find appropriate
> <video> elements within it, and then play the video,
> with captions, is a perfectly acceptable use case for
> determining requirements for the HyperText Markup Language.

For what it's worth, I'm happy to keep the work on WebSRT in the
WhatWG working group. We can always submit it to the W3C once it's a
more stable proposal. That would seem to allow us to work on the
technical aspects of the spec in parallel with solving the complex
question of which working group should handle it.

/ Jonas

Received on Sunday, 16 May 2010 23:57:50 UTC