
Re: Tech Discussions on the Multitrack Media (issue-152)

From: Mark Watson <watsonm@netflix.com>
Date: Thu, 24 Feb 2011 18:05:27 -0800
To: David Singer <singer@apple.com>
CC: Silvia Pfeiffer <silviapfeiffer1@gmail.com>, Bob Lund <B.Lund@cablelabs.com>, "public-html@w3.org" <public-html@w3.org>
Message-ID: <9A437B5C-5AEC-415C-A443-0423DC64C4A1@netflix.com>

On Feb 24, 2011, at 5:36 PM, David Singer wrote:

> 
> On Feb 24, 2011, at 17:29 , Silvia Pfeiffer wrote:
> 
>> 
>> 
>> When you talk about videos that are slide shows, are you actually
>> talking about videos or about a sequence of images (photos) that are
>> also sparse along the timeline, so not really "moving images"? If they
>> are encoded as video, it would be impossible to distinguish the data
>> as a sequence of photos. The other case - basically an "image track"
>> is not something we've ever discussed before and is not something that
>> all containers have formats for, AFAIK.
> 
> I mean a video track where the frame rate is like 1 frame every 10 seconds.  Many, or even most, containers can express this, I think.
> 
>> As for chapter images - I am not sure how they are encoded in
>> QuickTime/MPEG, so if you know, please share. I would have thought
>> that a text track with image urls could be sufficient for this.
> 
> 
> An obvious choice is as above, using an I-frame-only coding (e.g. JPEG).

As it happens, at Netflix, we use something exactly like that today, although the container format we use is not a standard one. We use a single file containing all the images, rather than a list of URLs, to avoid a lot of small requests. In our case the images are more like thumbnails, so a complete movie's worth of images is not that big.

I think, though, that the question to be asked in each case is whether it would ever make sense for rendering of the media to be handed to the web application, rather than being handled by the media player. I'm not sure there is anything useful that the web application could ever do with audio, but for things with a visual presentation there is always more flexibility in rendering at the web application level than in the media player. Passing every video frame to the web application layer might often be impractical, though, depending on the frame rate and other implementation issues. But the ability to get specific images, or all the images of a very-low-frame-rate version, could enable interesting navigation UIs.

This would suggest that the API should support passing samples with a visual representation, be they text or images, along with their timestamps, but that whether this is supported for a given (video) track, and perhaps whether all frames are available or just the I-frames, is up to the implementation. Implementors can take a view on the kinds of frame rate/encoding/container format for which they support this. UIs would have to be designed to make do without those things on platforms which did not support them.
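
As a rough sketch of the shape such an API might take (every name below, including VisualSample, SampleTrack, samplesSupported, granularity and sampleAt, is hypothetical and not taken from the HTML spec or any implementation):

```typescript
// Hypothetical types: nothing here is defined by the HTML specification.
// A sample with a visual representation, text or image, plus its timestamp.
type VisualSample =
  | { kind: "text"; time: number; text: string }
  | { kind: "image"; time: number; width: number; height: number; data: Uint8Array };

interface SampleTrack {
  // Whether the implementation exposes samples for this track at all.
  readonly samplesSupported: boolean;
  // Whether every frame is exposed, or only the I-frames.
  readonly granularity: "all-frames" | "i-frames-only";
  // Samples sorted by ascending timestamp, in seconds.
  readonly samples: readonly VisualSample[];
}

// Return the latest sample at or before `time`, or null when the track
// does not expose samples or no sample starts that early.
function sampleAt(track: SampleTrack, time: number): VisualSample | null {
  if (!track.samplesSupported) return null;
  let lo = 0;
  let hi = track.samples.length - 1;
  let found: VisualSample | null = null;
  // Binary search over the sorted timestamps.
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const candidate = track.samples[mid]!;
    if (candidate.time <= time) {
      found = candidate;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}
```

A navigation UI would check samplesSupported first and fall back to a plain seek bar when it is false, which is the "make do without" case above. For a track with one frame every 10 seconds, sampleAt(track, 15) would return the frame stamped at 10.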

...Mark

> 
> David Singer
> Multimedia and Software Standards, Apple Inc.
> 
> 
> 
Received on Friday, 25 February 2011 02:09:14 GMT
