- From: Jack Jansen <Jack.Jansen@cwi.nl>
- Date: Sun, 28 Jun 2009 22:35:20 +0200
- To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Cc: Media Fragment <public-media-fragment@w3.org>
- Message-Id: <1C6C0C6E-3D59-4DD4-84A6-F1869C65F471@cwi.nl>
On 27-Jun-2009, at 08:04 , Silvia Pfeiffer wrote:

> I've reviewed the document on the plane to New York (to the Open Video
> Conference) and am about to write a review blog post and a reply
> email. I will give it as general feedback, not from our group.
>
> I think DFXP satisfies all the requirements we could have as a group
> toward a caption / timed text format, namely:
> * providing a language descriptor for the file (-> lang attribute in
> tt element)
> * providing means of directly accessing markers for time fragments (->
> id attribute on timed text paragraphs)
>
> The only thing maybe missing is to provide a track name for when the
> DFXP file is included in a media file, but that is not as important
> and can be done during the muxing-process (e.g. using skeleton in
> Ogg).
>
> Anything else I forgot to make sure is available?

One of the things I'm not sure about (and unfortunately I don't have the time to check out the whole document... 100+ pages for a subtitle format...) is their timing constructs. Originally (5 years ago) they started with a SMIL timing model. Then they toned it down. But: I have no idea where they stopped.

A full SMIL timing model may be sub-optimal from a Media Fragments point of view. A SMIL document cannot be indexed uniquely with, say, #t=10,20, because what happens at t=10 may not be completely specified (being dependent on user input, for example). And even if the underlying document has fully specified timing (i.e. no event-based timing or media-based timing at all), selecting the portion that corresponds to #t=10,20 would require a full SMIL parser and execution engine. For SMIL this is to be expected: asking for #t=10,20 of a time-based composition document is a bit like asking for #xywh=10,10,100,100 of an SVG document. It would be non-trivial for simple cases and impossible in the general case.
When we (we==the SYMM group) designed SMILText we tried to be very careful to design a format that is linear in time, with timing constructs only at the outer level. This was done specifically so that selecting the portion corresponding to #t=10,20 would not require a full SMILText parser, but simply inspecting the top-level elements for their begin and next attributes and either copying or skipping the whole element and its substructure (and giving up if event-based timing was used). I'm pretty sure the same is true for CMML: it should also be possible to extract a temporal subsection without a full-blown CMML execution engine (but correct me if I'm overly optimistic here).

I'm not convinced it's possible to extract a temporal fragment of a DFXP file without a full execution engine. But: maybe they've defined profiles or something in the meantime that allow this (fragmenting is not the only use case that would benefit from a subset that is linear and easily parsed and such).

--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma Goldman
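
The copy-or-skip selection described above can be sketched roughly as follows, in Python. This is an illustration under stated assumptions, not the SMILText or DFXP schema: the element names `doc` and `block` are hypothetical, time values are taken to be plain seconds, `begin` is treated as an absolute time and `next` as an offset from the previous block's start, and each block is assumed to stay active until the following block starts. Any timing value that is not a plain number (event-based timing, for instance) makes the sketch give up.

```python
import xml.etree.ElementTree as ET


def start_times(root):
    """Resolve the start time (seconds) of each top-level element.

    Assumption: `begin` is absolute, `next` is relative to the previous
    element's start, and an element without either starts together with
    the previous one.
    """
    times, previous = [], 0.0
    for child in root:
        if "begin" in child.attrib:
            value, base = child.attrib["begin"], 0.0
        elif "next" in child.attrib:
            value, base = child.attrib["next"], previous
        else:
            value, base = "0", previous
        try:
            previous = base + float(value)
        except ValueError:
            # Event-based or otherwise unresolved timing: give up.
            raise ValueError(f"cannot resolve timing {value!r} statically") from None
        times.append(previous)
    return times


def select_fragment(xml_text, t_start, t_end):
    """Copy the top-level elements active during [t_start, t_end)."""
    root = ET.fromstring(xml_text)
    times = start_times(root)
    # Assumption: a block ends when the next one begins; the last never ends.
    ends = times[1:] + [float("inf")]
    kept = ET.Element(root.tag, root.attrib)
    for child, begin, end in zip(root, times, ends):
        if begin < t_end and end > t_start:
            kept.append(child)  # whole element, substructure untouched
    return kept


SAMPLE = """<doc>
  <block begin="0">Opening titles</block>
  <block begin="8">First caption</block>
  <block next="7">Second caption</block>
  <block begin="22">Third caption</block>
</doc>"""

fragment = select_fragment(SAMPLE, 10, 20)
print([block.text for block in fragment])
# prints ['First caption', 'Second caption']
```

Only the attributes of the top-level elements are inspected; everything nested inside a block is copied or skipped as a unit, which is why no execution engine is needed in this simplified setting.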
Attachments
- application/pkcs7-signature attachment: smime.p7s
Received on Sunday, 28 June 2009 20:36:02 UTC