- From: Sean Hayes <Sean.Hayes@microsoft.com>
- Date: Sun, 13 Feb 2011 14:27:09 +0000
- To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>, David Singer <singer@apple.com>
- CC: HTML Accessibility Task Force <public-html-a11y@w3.org>
Well, there are multiple ways to produce captions from a live source and deliver them, and this has been done with TTML; but the format itself isn't really the hard part. The hard part is capturing the text, reducing the latency, and then getting the captions muxed in and delivered on time with the appropriate time stamps.

Obviously one wouldn't try and deliver live produced captions in a single file, but it can be a reasonable way to send them for packaged media, whether delivered as a stream or not. The ideal is to be flexible and allow for a continuum between the all-in-one and every-caption-separate approaches.

IMO this discussion was to address some of David's technical concerns over TTML, since this is clearly not a forum to try and influence the VPAAC.

WRT HTML5, I believe that the chairs and PLH are still working on action 193, but as I understand it the general idea is that HTML5 should not specify a format itself, but reference one or more formats specified elsewhere in the W3C.

-----Original Message-----
From: Silvia Pfeiffer [mailto:silviapfeiffer1@gmail.com]
Sent: 13 February 2011 11:24
To: David Singer
Cc: Sean Hayes; HTML Accessibility Task Force
Subject: Re: using TTML for caption delivery, discussion

On Sun, Feb 13, 2011 at 8:37 PM, David Singer <singer@apple.com> wrote:
>
> On Feb 13, 2011, at 8:23, Sean Hayes wrote:
>
>> Point 3. Single or chunked delivery
>> Since the typical size of a caption file is only on the order of 10s of Kb, maybe 100Kb or so for a long form movie, actually receiving it all up front and parsing it in one go isn't that much of a problem, and generally an advantage. The only time it would be an issue is in delivery of live content where you don't know the captions in advance, and as far as I can tell that's not a use case that is supported by the <video> tag today.
>
> The video tag can point at anything, including RTSP controlled streams, and particularly it can point at chunked-over-HTTP manifest files, and even when pointing at a http: URL for a media file, byte-range access for time ranges can work. Nothing in HTML says it has to be an http: URL, and nothing says it has to be a simple from-the-beginning download.

I've actually seen the video element in use for live video streaming, so it's not just theoretically possible, but actually in active use. We have to make sure we can deliver captions in such scenarios, and one requirement for this is that captions are interleaved with the audio-visual data in a time-synchronized stream.

>> Integrating TTML into MPEG4 again is fairly easy due to the small size: it can simply all fit in one XML box, or be delivered as multiple segments in a trak. This has been defined for DECE and could be adopted into MPEG.
>
> Whole documents do sound 'heavy' though.

Not just heavy - they, in fact, make live streaming with live created captions impossible, since these could only be created interleaved with the audio-visual data. Delivery in one XML box is definitely not the best solution for the caption problem.

>> Point 4. Profiles.
>> There is a fairly comprehensive profiling mechanism built into TTML,
>
> but it only seems to allow covering language features, not characteristics of the stream (like, that it's in time order) or other functional aspects (like, CSS styling support), right?
>
>>
>> So I don't believe we actually achieve very much in the real world by trying to make a decision now. In a few years we may be able to see which format is gaining most ground in practice and make a decision then. The thing to do today is to ship HTML5 so that captioning is not precluded, allow pioneering content authors to write caption content in any format they choose, and wait and see how the browser vendors do on implementing <track> natively over time.
>
> Total agreement here.
> I have heard rumblings of a suggested mandate for TTML, whereas I would prefer to agree with you and get some experience doing captioning in particular and accessibility support in general before we see mandates.

I assume we are talking about the FCC VPAAC discussion here and not what we should recommend for HTML5 (given that the decision on caption format in the HTML5 spec has been resolved IMHO)? It would indeed be good if the FCC didn't recommend a format, but rather only specified requirements that a format has to meet.

Cheers,
Silvia.
Received on Sunday, 13 February 2011 14:27:46 UTC