- From: Anne van Kesteren <annevk@opera.com>
- Date: Fri, 16 Apr 2010 16:03:22 +0900
On Fri, 16 Apr 2010 15:49:38 +0900, Silvia Pfeiffer
<silviapfeiffer1 at gmail.com> wrote:
> On Fri, Apr 16, 2010 at 3:32 PM, Anne van Kesteren <annevk at opera.com>
> wrote:
>> A spec would also need to be written if we go for this new
>> TTML-minus-certain-features-and-using-CSS-rather-than-XSL-FO format.
>> That would probably be worse since we would be forking an existing
>> format in an incompatible way.
>
> No forking - just specifying a mapping of the things that are
> supportable. And yes: that needs to be written too.

Sounds like a fork to me. E.g. if we don't want a new parser for <color>
values (and we really don't) and use the CSS parser things would be
different.

>>> Also, if we are introducing HTML markup inside SRT time cues, then it
>>> would make sense to turn the complete SRT file into markup, not just
>>> the part inside the time cue. Further, SRT has no way to specify which
>>> language it is written in and further such general mechanisms that
>>> already exist for HTML.
>>
>> What general mechanisms are needed exactly? Why is language needed?
>> Isn't that already specified by the embedder?
>
> I guess the problem is more with char sets.
> For HTML pages and other Web content, there is typically information
> inside the resource that tells you what character set the document is
> written in. E.g. HTML pages have
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">.
> Such functionality is not available for SRT, so it is impossible for a
> browser to tell what charset to use to render the content in.

It would simply always be UTF-8, much like text/cache-manifest and
text/event-stream.

> And yes, we have made an adjustment in the Media Associations spec for
> <track> to contain a hint on what mime type and charset the external
> document is specified in. But that is only a bad fix of SRT's problem.
> It should be available inside the file so that any application can use
> the SRT file without requiring additional information.

I guess.

> The extended SRT file will barely have anything in common with the
> original ones. There is more HTML markup to learn than SRT markup. And
> having HTML markup encapsulated in a non-html file is just weird.
> Also, the numbering through of the captions is honestly not very
> useful.

Yeah, maybe you're right.

> (3) TTML file: (no hyperlinks, no images - just for comparison)
>
> ---
> <?xml version="1.0" encoding="utf-8"?>
> <tt xml:lang="en_us" xmlns="http://www.w3.org/ns/ttml">
>   <head>
>     <styling>
>       <style xml:id="left-align"
>         tts:fontFamily="proportionalSansSerif"
>         tts:textAlign="left"
>       />
>       <style xml:id="right-align"
>         tts:fontFamily="monospaceSerif"
>         tts:textAlign="right"
>       />
>       <style xml:id="speaker"
>         tts:fontFamily="monospaceSerif"
>         tts:textAlign="left"
>         tts:fontWeight="bold"
>       />
>     </styling>
>     <layout>
>       <region xml:id="subtitleArea"
>         tts:extent="560px 62px"
>         tts:padding="5px 3px"
>       />
>     </layout>
>   </head>
>   <body region="subtitleArea">
>     <div>
>       <p style="left-align" begin="0.15s" end="0.17s 951ms">
>         <div style="speaker">Proog:</div>
>         <div tts:color="green">At the <span
>           tts:fontStyle="italic">left</span> we can see...</div>
>       </p>
>       <p style="right-align" begin="0.18s 166ms" end="0.20s 83ms">
>         <div tts:color="green">At the right we can see the...</div>
>       </p>
>     </div>
>   </body>
> </tt>
> ---

That this sample file has namespace errors and is therefore not
well-formed is part of the reason I think TTML is a very bad idea.
(Besides giving a new meaning to a bunch of HTML-like elements.)

> (4) possibly new xml/html-ish file:
>
> [...]
>
> I think (4) is preferable over (2) for the more consistent markup and
> actual xml parsability.

I don't buy the XML parser argument (as a) an XML parser is not much
simpler because of the internal subset and b) it comes with namespaces),
but I can see how a new format might be somewhat better-looking than
something based on SRT.

-- 
Anne van Kesteren
http://annevankesteren.nl/
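
For anyone who wants to check the namespace complaint about the quoted TTML sample: the sample uses tts:-prefixed attributes but never declares the tts prefix on the root element, so a namespace-aware parser should refuse it. Below is a minimal sketch using Python's standard xml.etree.ElementTree. The two fragments are hypothetical, cut-down versions of the sample, the helper name parses() is made up for illustration, and the xmlns:tts value is the TTML styling namespace, which is an assumption added here and does not appear in the original message.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down fragment modelled on the quoted sample:
# the tts: prefix is used but never declared.
UNDECLARED = (
    '<tt xmlns="http://www.w3.org/ns/ttml">'
    '<head><styling>'
    '<style xml:id="left-align" tts:textAlign="left"/>'
    '</styling></head>'
    '</tt>'
)

# Same fragment with the prefix bound on the root element.
# The namespace URI is an assumption (the TTML styling namespace).
DECLARED = UNDECLARED.replace(
    '<tt xmlns="http://www.w3.org/ns/ttml">',
    '<tt xmlns="http://www.w3.org/ns/ttml" '
    'xmlns:tts="http://www.w3.org/ns/ttml#styling">',
)


def parses(xml_text: str) -> bool:
    """Return True if the namespace-aware parser accepts the text."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print("undeclared tts prefix parses:", parses(UNDECLARED))
    print("declared tts prefix parses:  ", parses(DECLARED))
```

The first fragment is expected to be rejected and the second accepted; the only difference between them is the xmlns:tts declaration.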
Received on Friday, 16 April 2010 00:03:22 UTC