- From: Philip Jägenstedt <philipj@opera.com>
- Date: Fri, 22 Oct 2010 10:19:41 +0200
On Tue, 19 Oct 2010 22:35:50 +0200, Silvia Pfeiffer
<silviapfeiffer1 at gmail.com> wrote:

> On Tue, Sep 14, 2010 at 7:49 PM, Philip Jägenstedt <philipj at opera.com>
> wrote:
>> On Tue, 14 Sep 2010 10:30:03 +0200, Simon Pieters <simonp at opera.com>
>> wrote:
>>
>>> On Tue, 14 Sep 2010 10:11:16 +0200, Philip Jägenstedt
>>> <philipj at opera.com> wrote:
>>>
>>>> The point of a header is that browsers can identify WebSRT files and
>>>> not keep parsing through a 100GB movie file,
>>>
>>> I don't think we should break SRT compat for this. I don't think this
>>> is a problem at all. We already have this situation elsewhere, e.g.
>>> what if you do <link rel=stylesheet href=movie.webm>?
>>>
>>> If it really turns out to be a problem you could just apply the
>>> hardware limitations clause and abort parsing if you haven't found any
>>> cues after parsing X bytes or whatever.
>>>
>>> In any case, the spec currently requires text/srt (or other supported
>>> subtitle format MIME type) for <track>, so a movie file would be
>>> rejected based on the MIME type per spec (see step 4 in
>>> #sourcing-out-of-band-timed-tracks).
>>>
>>
>> Well, I was hoping to sidestep the issue of MIME types and file
>> extensions by always ignoring them. Last I checked Apache doesn't have a
>> default mapping for .srt, so everyone using <track> would have to add it
>> themselves.
>>
>> About metadata, I noticed that there's a voice called <credit>...
>
> I think that's only for the credits at the start or end of a movie.
>
>
> Anyway: I'm trying to summarize the changes that were discussed this
> far to WebSRT. I think we have the following:
>
> * add a header to identify the kind of websrt file & the language
> * add a means to add metadata as name-value pairs
>
> e.g.
> WebSRT
> language: en-US
> author: Frank
> date: 2010-09-20
> kind: subtitle
> copyright: WGBH, 2010
> license: CC-BY-SA, http://creativecommons.org/licenses/by-sa/3.0/

What should happen when the language in <track srclang> doesn't match the
language in the file itself? Also, why is kind needed in the file?

> * add a means to add comments
>
> e.g.
> // Lines starting with // are comments

So far the web has two comment syntaxes: <!-- SGML style --> and
/* CSS style */, so if we need comments I think we should pick one of
these.

> And some changes on <track>:
> * make @kind a required attribute

Why was this?

> * add @type for mime type identification as we allow more than just
> WebSRT as external formats, e.g. TTML

Having more than one format seems to complicate rendering. The WebSRT
rendering rules try to avoid overlap between cues from different tracks,
but I don't see how that could work between different formats, unless all
formats have basically the same model. It certainly wouldn't work with a
fixed-layout format like TTML. In other words, can't this wait until some
implementor has shown concrete interest in implementing more than one
format?

Anyway, I agree that at least a magic header like "WebSRT" is needed
because of the horrors of legacy SRT parsing. Breaking SRT compat means
that we can go back to requiring UTF-8 as the encoding. However, UTF-8
does complicate the magic header a bit due to the possibility of a BOM
[1]. While it would be nice to forbid the use of a BOM, I expect we'd then
see lots of frustration from authors whose editors automatically insert
it...

[1] http://en.wikipedia.org/wiki/Byte_order_mark#UTF-8

-- 
Philip Jägenstedt
Core Developer
Opera Software
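
To make the BOM complication concrete, here is a minimal sketch of a magic-header check that tolerates an optional UTF-8 BOM. The function name looks_like_websrt is hypothetical, and the rule that the magic string must be followed by a line terminator or end of input is an assumption for illustration; nothing in this thread settles those details.

```python
import codecs

# Magic string proposed in this thread; the exact bytes are not yet specified.
MAGIC = b"WebSRT"


def looks_like_websrt(data: bytes) -> bool:
    """Return True if data begins with the magic header (hypothetical helper).

    A leading UTF-8 BOM (EF BB BF) is skipped before the comparison, so
    files saved by editors that insert one are still recognized.
    """
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    if not data.startswith(MAGIC):
        return False
    # Assumption: the magic must be alone at the start of its line, so
    # "WebSRTX" is rejected while "WebSRT\n" and a bare "WebSRT" pass.
    following = data[len(MAGIC):len(MAGIC) + 1]
    return following in (b"", b"\r", b"\n")
```

With a check like this a parser only needs to inspect the first nine bytes or so before giving up, which is the point of the header: a movie file or a legacy SRT file fails immediately instead of being scanned for cues.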
Received on Friday, 22 October 2010 01:19:41 UTC