Re: Timed tracks

On Fri, May 7, 2010 at 7:53 AM, Tab Atkins Jr. <> wrote:
> On Thu, May 6, 2010 at 2:25 PM, John Foliot <> wrote:
>> Tab Atkins Jr. wrote:
>>> SRT is the closest-to-ideal existing format,
>> Tab, with all due respect, what documented facts is this bold assertion
>> based upon? URLs would be most appreciated here, as the Media Sub-Group
>> are assembling a needs requirement document at this time.
> All of the use-cases of actual caption use on the web, collected on
> the WHATWG wiki at
>  Additionally, API-level access use-cases for captions on the web have
> been collected at
>> As co-chair of the Media Sub-Group at the W3C Accessibility Task Force for
>> HTML5, active participants (including, significantly, engineers from the
>> related browser manufacturers) have been discussing Time Text formats for
>> some time now, and a recent survey (2010-03-08 to 2010-03-11) of the larger
>> a11y Task Force showed almost equal support for the minimal SRT format as
>> well as a more robust format, likely DFXP/TTML and/or a profile of that.
>> The general consensus (and others are free to correct me) was that SRT was
>> at best a minimal time-stamp format that could be used, but that it did not
>> meet the 'robustness' test for all aspects of accessibility. Suggesting that
>> it is the "closest-to-ideal" is pure folly and opinion at this time, and
>> does not accurately reflect the opinion of those who are working closely at
>> this subject (again, including engineers from Microsoft, Apple, Opera and
>> Mozilla directly involved with <video> implementation in the browsers). In
>> fact, Maciej himself suggested (in his survey response): "I don't think it's
>> necessary to require a specific format for the initial proposal. It seems
>> like requiring any one format will just make it more controversial."
> Indeed, plain SRT is pretty minimal, and doesn't address many of the
> documented use-cases.  But it's very simple to both author and parse,
> and the extensions needed to make it handle all the aforementioned
> use-cases are pretty minimal.  It's also pretty common, apparently
> especially so amongst amateur subbers, which implies that it probably
> addresses the needs and desires of average authors pretty well.
> It may be that we end up needing to support multiple formats, such as
> perhaps a profile of TTML.  But I'd like to avoid that if at all
> possible, and from what I understand implementors would too.
>>> Even if we were to just
>>> say "All right, we're just doing TTML", it would require us to still
>>> produce a spec explaining how TTML's layout primitives should be
>>> interpreted.  Potentially, of course, browsers could just implement
>>> XSL:FO directly, but initial feedback indicates that that's not an
>>> option they're willing to support.  So we'd have to define how all of
>>> that maps into CSS, which would be as much or more work.
>> And those types of discussions are on-going within the W3C Task Force
>> charged with that requirement. As Matt May (and others) have pointed out
>> (,
>> mapping the basic start and end times of any time-stamped document to a
>> basic DFXP profile is not only quite easy, but significantly such a profile
>> currently exists:
>> That DFXP profile (it should be noted) is also already supported by
>> Flash-based video players such as JW-FLV (arguably the most widely deployed
>> Open Source media player on the web today -,
>> NCAM's CCforFlash player ( and
>> Nomensa's Accessible Media Player
>> (
>> DFXP also has support within Silverlight-based media players which suggests
>> that there is also existing DFXP content in the wild today.
>> Further, authoring tools already exist to create these DFXP time-stamp files
>> today: For small and independent authors there is MAGpie*
>> (
>> amongst others, and significantly, the Broadcast Industry is also creating
>> tools to generate DFXP
>> (
>> Given that this class of content producer is likely going to be
>> front-runners in captioned video on the web (especially if legislative
>> initiatives such as H.R. 3101 -
>> - come to fruition)
>> I am curious to know if they were consulted on this new WebSRT format?
>> (*Funding for MAGpie was provided in part by the National Institute on
>> Disability and Rehabilitation Research (NIDRR), the U.S. Department of
>> Education, and the Mitsubishi Electric America Foundation)
> Indeed, we may end up needing to support TTML or DFXP.  But it's much
> more complex in both generation and parsing than we need, and requires
> work to map its formatting into CSS terms.  That effort may be better
> spent elsewhere.
> Seeing as the subtitling ecosystem is pretty diverse, and we can't
> possibly support all the formats out there, a lot of people are going
> to have to do transcoding to another format *anyway* to get their
> stuff on the web.  Making a few more people do it may be a worthwhile
> cost for the benefit of having a single simple format for captioning
> on the web.
>> Continue to expect significant and vocal opposition to this newly
>> re-invented Time-stamp wheel, which apparently sprang to life earlier this
>> week from the editor of the WHAT WG, as a complete and total surprise to
>> Media Captioning experts and Accessibility specialists of all stripes within
>> the W3C (such as Geoff, whose years of involvement within NCAM/WGBH - the
>> 'inventors' of captioning for television "video media" - carries significant
>> weight, research and experience when it comes to understanding both user
>> requirements, as well as an understanding of implementation issues).
> No, the use-cases have been collected for a while, in hand with
> significant effort from Silvia Pfeiffer.  No need to invent a fiction
> of Hixie creating these things out of whole cloth.

I don't want this to stand in the room as though there is blanket
support from me for everything that is going into the HTML5 draft
right now. I have tried to contribute to the effort of requirement
collection, but I have not participated in the actual spec writing
and I don't agree with everything that is there.

In fact, I haven't actually made up my mind whether WebSRT is a good
thing. Certainly, any caption support in HTML5 IMO should provide
support for SRT, but not exclusively so. I do regard WebSRT as a new
format, similar to how HTML5 is a new format in comparison to HTML4.

I am uncertain if WebSRT will satisfy all the requirements that we
will ever have for an external text format, but from what my early
glances can tell, it is pretty well suited to most current needs. It
is limited, still, which is both an advantage and a disadvantage. On a
scale of features that an external text format provides, SRT is very
strongly on the low side and WebSRT is strongly on the 80% side of
needs. I do think that ultimately we want to complete this with a
format that supports as much as possible, i.e. something that is
essentially HTML but attached to time stamps. Then we would cover all

Given this, I am strongly in favour of having any external associated
text format specified independently from the HTML5 specification. It
will also help authoring of such files, since they will not just be
used in the Web context. If for example another standards body would
want to adopt WebSRT into their specifications, they would really need
a separate document to reference. Also, for registering a MIME type, a
separate document makes a lot more sense. Further, somebody trying to
learn WebSRT would probably be highly confused by it being just a
section in the huge HTML5 spec, when they in fact have no interest in


Received on Friday, 7 May 2010 00:25:35 UTC