Re: Timed tracks from Tab Atkins Jr. on 2010-05-06 (public-html@w3.org from May 2010)

From: Tab Atkins Jr. <jackalmage@gmail.com>
Date: Thu, 6 May 2010 14:53:20 -0700
To: John Foliot <jfoliot@stanford.edu>
Cc: Geoff Freed <geoff_freed@wgbh.org>, Maciej Stachowiak <mjs@apple.com>, Philippe Le Hegaret <plh@w3.org>, "Edward O'Connor" <hober0@gmail.com>, Ian Hickson <ian@hixie.ch>, public-html@w3.org
Message-ID: <v2pdd0fbad1005061453ted3176c8yacecae6106f3069b@mail.gmail.com>
On Thu, May 6, 2010 at 2:25 PM, John Foliot <jfoliot@stanford.edu> wrote:
> Tab Atkins Jr. wrote:
>>
>> SRT is the closest-to-ideal existing format,
>
> Tab, with all due respect, what documented facts is this bold assertion
> based upon? URLs would be most appreciated here, as the Media Sub-Group
> are assembling a needs requirement document at this time.

All of the use-cases of actual caption use on the web, collected on
the WHATWG wiki at
http://wiki.whatwg.org/wiki/Use_cases_for_timed_tracks_rendered_over_video_by_the_UA.
Additionally, API-level access use-cases for captions on the web have
been collected at
http://wiki.whatwg.org/wiki/Use_cases_for_API-level_access_to_timed_tracks.


> As co-chair of the Media Sub-Group at the W3C Accessibility Task Force for
> HTML5, active participants (including, significantly, engineers from the
> related browser manufacturers) have been discussing Time Text formats for
> some time now, and a recent survey (2010-03-08 to 2010-03-11) of the larger
> a11y Task Force showed almost equal support for the minimal SRT format as
> well as a more robust format, likely DFXP/TTML and/or a profile of that.
> - http://www.w3.org/2002/09/wbs/44061/media-text-format/results
>
> The general consensus (and others are free to correct me) was that SRT was
> at best a minimal time-stamp format that could be used, but that it did not
> meet the 'robustness' test for all aspects of accessibility. Suggesting that
> it is the "closest-to-ideal" is pure folly and opinion at this time, and
> does not accurately reflect the opinion of those who are working closely at
> this subject (again, including engineers from Microsoft, Apple, Opera and
> Mozilla directly involved with <video> implementation in the browsers). In
> fact, Maciej himself suggested (in his survey response): "I don't think it's
> necessary to require a specific format for the initial proposal. It seems
> like requiring any one format will just make it more controversial."

Indeed, plain SRT is pretty minimal, and doesn't address many of the
documented use-cases.  But it's very simple to both author and parse,
and the extensions needed to make it handle all the aforementioned
use-cases are pretty minimal.  It's also in common use, apparently
especially amongst amateur subbers, which suggests that it
addresses the needs and desires of average authors pretty well.
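
To illustrate the "simple to parse" point, here is a rough sketch of a
parser for plain SRT in Python.  It assumes the commonly seen layout
(a numeric counter line, a "start --> end" timing line with comma
millisecond separators, then one or more text lines, with cues separated
by blank lines).  The names Cue and parse_srt are made up for this
example, and none of the extensions under discussion are handled.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    """One subtitle cue; times are in milliseconds (hypothetical type)."""
    index: int
    start_ms: int
    end_ms: int
    text: str


# Timing line as commonly seen in SRT files: "00:00:01,000 --> 00:00:04,000"
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2}),(\d{3})"
)


def _to_ms(hours: str, minutes: str, seconds: str, millis: str) -> int:
    """Convert the four captured timestamp fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(data: str) -> List[Cue]:
    """Parse plain SRT text into cues, silently skipping malformed blocks."""
    cues = []
    # Normalise line endings, then split into blocks on blank lines.
    blocks = re.split(r"\n\s*\n", data.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        if len(lines) < 2 or not lines[0].strip().isdigit():
            continue  # no counter line, or nothing after it
        match = _TIMING.match(lines[1].strip())
        if match is None:
            continue  # second line is not a timing line
        fields = match.groups()
        cues.append(
            Cue(
                index=int(lines[0]),
                start_ms=_to_ms(*fields[:4]),
                end_ms=_to_ms(*fields[4:]),
                text="\n".join(lines[2:]),
            )
        )
    return cues
```

Calling parse_srt() on the contents of a file gives a list of cues whose
start_ms and end_ms can be compared directly against the playback
position.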

It may be that we end up needing to support multiple formats, such as
perhaps a profile of TTML.  But I'd like to avoid that if at all
possible, and from what I understand implementors would too.


>> Even if we were to just
>> say "All right, we're just doing TTML", it would require us to still
>> produce a spec explaining how TTML's layout primitives should be
>> interpreted.  Potentially, of course, browsers could just implement
>> XSL:FO directly, but initial feedback indicates that that's not an
>> option they're willing to support.  So we'd have to define how all of
>> that maps into CSS, which would be as much or more work.
>
> And those types of discussions are on-going within the W3C Task Force
> charged with that requirement. As Matt May (and others) have pointed out
> (http://lists.w3.org/Archives/Public/public-html-a11y/2010Mar/0102.html),
> mapping the basic start and end times of any time-stamped document to a
> basic DFXP profile is not only quite easy, but significantly such a profile
> currently exists:
> http://www.w3.org/TR/ttaf1-dfxp/#profile-dfxp-presentation.
>
> That DFXP profile (it should be noted) is also already supported by
> Flash-based video players such as JW-FLV (arguably the most widely deployed
> Open Source media player on the web today - http://longtailvideo.com),
> NCAM's CCforFlash player (http://ncam.wgbh.org/webaccess/ccforflash/) and
> Nomensa's Accessible Media Player
> (http://www.nomensa.com/web-accessibility/what-we-do/accessible-media-player).
> DFXP also has support within Silverlight-based media players which suggests
> that there is also existing DFXP content in the wild today.
>
> Further, authoring tools already exist to create these DFXP time-stamp files
> today: For small and independent authors there is MAGpie*
> (http://ncam.wgbh.org/invent_build/web_multimedia/tools-guidelines/magpie)
> amongst others, and significantly, the Broadcast Industry is also creating
> tools to generate DFXP
> (http://broadcastengineering.com/automation/ninsight-unveils-dfxp-subtitling-mxf-ayoto-0901/)
> Given that this class of content producer is likely going to be
> front-runners in captioned video on the web (especially if legislative
> initiatives such as H.R. 3101 -
> http://www.govtrack.us/congress/bill.xpd?bill=h111-3101 - come to fruition)
> I am curious to know if they were consulted on this new WebSRT format?
>
> (*Funding for MAGpie was provided in part by the National Institute on
> Disability and Rehabilitation Research (NIDRR), the U.S. Department of
> Education, and the Mitsubishi Electric America Foundation)

Indeed, we may end up needing to support TTML or DFXP.  But it's much
more complex in both generation and parsing than we need, and requires
work to map its formatting into CSS terms.  That effort may be better
spent elsewhere.

Seeing as the subtitling ecosystem is pretty diverse, and we can't
possibly support all the formats out there, a lot of people are going
to have to do transcoding to another format *anyway* to get their
stuff on the web.  Making a few more people do it may be a worthwhile
cost for the benefit of having a single simple format for captioning
on the web.


> Continue to expect significant and vocal opposition to this newly
> re-invented Time-stamp wheel, which apparently sprang to life earlier this
> week from the editor of the WHAT WG, as a complete and total surprise to
> Media Captioning experts and Accessibility specialists of all stripes within
> the W3C (such as Geoff, whose years of involvement within NCAM/WGBH - the
> 'inventors' of captioning for television "video media" - carries significant
> weight, research and experience when it comes to understanding both user
> requirements, as well as an understanding of implementation issues).

No, the use-cases have been collected for a while, with significant
effort from Silvia Pfeiffer.  No need to invent a fiction of Hixie
creating these things out of whole cloth.

~TJ
Received on Thursday, 6 May 2010 21:54:14 UTC