- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Thu, 16 Jul 2009 15:58:30 +1000

Hi Ian,

Great to see the new efforts to move the subtitle/caption/karaoke issues forward!

I actually have a contract with Mozilla starting this month to help solve this, so I am more than grateful that you have proposed some ideas in this space.

On Thu, Jul 16, 2009 at 9:38 AM, Ian Hickson<ian at hixie.ch> wrote:
> On Sat, 27 Dec 2008, Silvia Pfeiffer wrote:
>> > 1. Timed text in the resource itself (or linked from the resource
>> > itself), rendered as part of the video automatically by the user
>> > agent.
>>
>> For case 1, the practical implications are that browser vendors will
>> have to develop support for a large variety of text codecs, each one
>> providing different functionalities.
>
> I would hope that as with a video codec, we can standardise on a single
> subtitle format, ideally some simple media-independent combination of SRT
> and LRC [1]. It's difficult to solve this problem without a standard
> codec, though.

I have myself thought about creating a new format to address the needs for time-aligned text in audio/video. However, the problem with creating a new format is that you start from scratch, and formats that are already widespread are not supported. I can see that your proposed format is trying to be backwards compatible with SRT, so at least it would work for the large number of existing srt file collections. I am still skeptical, in particular because there are no authoring systems for this format around. But I would be curious what others think about your proposed SRT-LRC mix.

>> In fact, the easiest solution would be if that particular format was
>> really only HTML.
>
> IMHO that would be absurd. HTML means scripting, embedded videos, an
> unbelievably complex rendering system, complex parsing, etc; plus, what's
> more, it doesn't even support timing yet, so we'd have to add all the
> timing and karaoke features on top of it.
> Requiring that video players
> embed a timed HTML renderer just to render subtitles is like saying that
> we should ship Microsoft Word with every DVD player, to handle the user
> input when the user wants to type in a new chapter number to jump to.

I agree, it cannot be a format that contains all the complexity of HTML. It would only support the relevant subset of HTML, plus the addition of timing, and in that case it is indeed a new format. I have therefore changed my mind since I sent that email in Dec 08 and am hoping we can do it with existing formats.

In particular, I have taken an in-depth look at the latest specification from the Timed Text working group, which has put years of experiments and decades of experience into developing DFXP. You can see my review of DFXP here:
http://blog.gingertech.net/2009/06/28/a-review-of-the-w3c-timed-text-authoring-format/ .
I think it is too flexible in a lot of ways, but also too restrictive in others. However, it is a well-formulated format that is also getting market traction. In addition, it is possible to formulate profiles to add missing functionality.

If we want a quick and dirty hack, srt itself is probably the best solution. If we want a well thought-out solution, DFXP is probably a better idea.

I am currently experimenting with these and will be able to share something soon for further discussion.

>> > 3. Timed text stored in a separate file, which is then parsed by the
>> > user agent and rendered as part of the video automatically by the
>> > browser.
>> >
>> Maybe we should consider solving this differently. Either we could
>> encapsulate into the video container upon download. Or we could create a
>> zip-file or tarball upon download. I'd just find it a big mistake to
>> ignore the majority use case in the standard, which is why I proposed
>> the <text> elements inside the <video> tag.
>
> If browser vendors are willing to merge subtitles and video files when
> saving them, that would be great.
> Is this easy to do?

My suggestion was really about doing this server-side, which we implemented years ago in the Annodex project for Ogg Theora/Vorbis. However, it is also possible to do this in the browser: in the case of Ogg, the browser just needs to have a multiplexing library installed as well as a means to encode the subtitle file (which I like to call a "text codec"). Since it's text, it's nowhere near as complex as encoding audio or video and just consists of light-weight packaging code. So, yes, it is entirely possible to have the browser create a binary video file that encapsulates subtitles that were previously only accessible as referenced text files behind a separate URL.

The only issue I see is the baseline codec issue: every browser that wants to support multiple media formats has to implement this multiplexing and text encoding differently for every media encapsulation format, which is annoying and increases complexity. It is, however, generally a small amount of complexity compared to the complexity created by having to support multiple codecs.

>> Here is my example again:
>> <video src="http://example.com/video.ogv" controls>
>>  <text category="CC" lang="en" type="text/x-srt" src="caption.srt"></text>
>>  <text category="SUB" lang="de" type="application/ttaf+xml" src="german.dfxp"></text>
>>  <text category="SUB" lang="jp" type="application/smil" src="japanese.smil"></text>
>>  <text category="SUB" lang="fr" type="text/x-srt" src="translation_webservice/fr/caption.srt"></text>
>> </video>
>
> Here's a counterproposal:
>
>   <video src="http://example.com/video.ogv"
>          subtitles="http://example.com/caption.srt" controls>
>   </video>

Subtitle files are created so that users can choose to display the text in the language they speak. With a simple addition like what you are proposing, I don't think such a choice is possible. Or do you have a proposal on how to choose the appropriate language file?
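
As a rough illustration of the choice that the <text> elements are meant to enable, here is a minimal Python sketch of how a user agent might collect the tracks and pick one by language. The helper names (collect_tracks, choose_track) and the "first preferred language wins" rule are assumptions made for this sketch; they are not part of either proposal.

```python
from html.parser import HTMLParser

# The markup from the example above, with the proposed <text> elements.
MARKUP = """
<video src="http://example.com/video.ogv" controls>
  <text category="CC" lang="en" type="text/x-srt" src="caption.srt"></text>
  <text category="SUB" lang="de" type="application/ttaf+xml" src="german.dfxp"></text>
  <text category="SUB" lang="jp" type="application/smil" src="japanese.smil"></text>
  <text category="SUB" lang="fr" type="text/x-srt" src="translation_webservice/fr/caption.srt"></text>
</video>
"""


class TextTrackCollector(HTMLParser):
    """Collects the attributes of every <text> element it encounters."""

    def __init__(self):
        super().__init__()
        self.tracks = []

    def handle_starttag(self, tag, attrs):
        if tag == "text":
            self.tracks.append(dict(attrs))


def collect_tracks(markup):
    """Return one attribute dict per <text> element (hypothetical helper)."""
    parser = TextTrackCollector()
    parser.feed(markup)
    return parser.tracks


def choose_track(tracks, preferred_langs):
    """Return the first track matching the user's language preferences,
    in preference order, or None if nothing matches (assumed policy)."""
    for lang in preferred_langs:
        for track in tracks:
            if track.get("lang") == lang:
                return track
    return None


tracks = collect_tracks(MARKUP)
# The (language, category) pairs a menu could be built from.
menu = [(t["lang"], t["category"]) for t in tracks]
# A user who prefers German, then English.
chosen = choose_track(tracks, ["de", "en"])
```

With a single subtitles attribute there is only one URL, so nothing comparable to the menu list can be built from the markup alone.
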

Also, the attributes on the proposed text element of course serve a purpose:
* the "category" attribute is meant to provide a default for styling the text track,
* the "language" attribute is meant to provide a means to build a menu from which to choose the appropriate subtitle file,
* the "type" attribute is meant to identify both the mime type of the format and the character set used in the file.

The character set question is actually a really difficult problem to get right, because srt files are created in a character set appropriate for the language, but there is no means to store in an srt file which character set was used in its creation. That's a really bad situation to be in for the Web server, which can then only take an educated guess. Giving the HTML author the ability to specify the charset of the srt file with the link solves this.

BTW: my latest experiments with subtitles have even a few more attributes. I am not ready to publish that yet, but should be within a week or so and will be glad to have a further discussion then.

> I think this would be fine, on the long term. I don't think the existing
> implementations of <video> are at a point yet where it makes sense to
> define this yet, though.

I think we have to start discussing it and doing experiments. I think <video> is getting stable enough to move forward. I'm expecting a period of discussion and experimentation with time-aligned text both in-band and out-of-band, so it's good to get started on this sooner rather than later.

> It would be interesting to hear back from the browser vendors about how
> easily the subtitles could be kept with the video in a way that survives
> reuse in other contexts.

Incidentally, I'd be interested in such information about H.264. I wonder how easy it will be, for example with QuickTime or mp4, to encapsulate srt on-the-fly inside a browser.

Regards,
Silvia.

Received on Wednesday, 15 July 2009 22:58:30 UTC