Re: [whatwg] VIDEO and pitchAdjustment from David Singer on 2015-09-01 (public-whatwg-archive@w3.org from September 2015)

From: David Singer <singer@apple.com>
Date: Tue, 01 Sep 2015 10:30:01 -0700
To: robert@ocallahan.org
Cc: WHAT Working Group <whatwg@lists.whatwg.org>, Kevin Marks <kevinmarks@gmail.com>
Message-id: <A70B971C-1A66-4415-B28C-8A939D541BC2@apple.com>

> On Sep 1, 2015, at 4:03 , Robert O'Callahan <robert@ocallahan.org> wrote:
> 
> On Tue, Sep 1, 2015 at 8:02 PM, Kevin Marks <kevinmarks@gmail.com> wrote:
> 
>> QuickTime supports full variable speed playback and has done for well over
>> a decade. With bidirectionally predicted frames you need a fair few buffers
>> anyway, so generalising to full variable rate is easier than posters above
>> claim - you need to work a GOP at a time, but memory buffering isn't the
>> big issue these days.
>> 
> 
> "GOP"?

Group of Pictures.  Video-speak for the run between random access points.

> 
> How about a hard but realistic (IMHO) case: 4K video (4096 x 2160), 25 fps,
> keyframe every 10s. Storing all those frames takes 250 x 4096 x 2160 x 2
> bytes = 4.32 GiB. Reading back those frames would kill performance so that
> all has to stay in VRAM. I respectfully deny that in such a case, memory
> buffering "isn't a big issue".

Well, 10 s is a pretty long random-access interval.
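
As a quick check of the arithmetic in the quoted estimate, here is a minimal sketch. The constant and function names are illustrative, and the 2 bytes per pixel is taken from the quoted message rather than from any particular pixel format. By this calculation the product comes to about 4.42 GB (roughly 4.12 GiB), slightly different from the quoted 4.32 figure but the same order of magnitude.

```python
# Back-of-envelope size of one fully decoded GOP, using the numbers
# from the quoted message. All names here are illustrative.
WIDTH, HEIGHT = 4096, 2160      # 4K frame dimensions
BYTES_PER_PIXEL = 2             # as assumed in the quoted estimate
FPS = 25
KEYFRAME_INTERVAL_S = 10        # one keyframe every 10 seconds


def gop_buffer_bytes(width, height, bytes_per_pixel, fps, interval_s):
    """Bytes needed to hold every decoded frame between two keyframes."""
    frames = fps * interval_s
    return frames * width * height * bytes_per_pixel


total = gop_buffer_bytes(WIDTH, HEIGHT, BYTES_PER_PIXEL, FPS, KEYFRAME_INTERVAL_S)
print(f"{total} bytes = {total / 1e9:.2f} GB = {total / 2**30:.2f} GiB")
```

A shorter random access interval scales this down linearly: at one keyframe every 2 s the same formula gives a fifth of the total.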

> 
> Now that I think about it, I guess there are more complicated strategies
> available that would reduce memory usage at the expense of repeated
> decoding.

Which, indeed, QuickTime implemented around 10 years ago.

> E.g. in a first pass, decode forward and store every Nth frame.
> Then as you play backwards you need only redecode N-1 intermediate frames
> at a time. I don't know whether HW decoder interfaces would actually let you
> implement that though...
> 
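The "store every Nth frame, then redecode" idea quoted above can be sketched roughly as follows. This is a toy model, not QuickTime's implementation: `decode_next` is a hypothetical decoder step in which each frame depends only on the previous decoded frame, which ignores B-frames and any hardware decoder constraints.

```python
def reverse_frames(gop_len, step, decode_next):
    """Yield the decoded frames of one GOP in reverse order.

    decode_next(prev, i) is a hypothetical decoder step that returns
    frame i given decoded frame i-1 (prev is None for the keyframe).
    """
    # First pass: decode forward, keeping only every `step`-th frame.
    checkpoints = {}
    prev = None
    for i in range(gop_len):
        prev = decode_next(prev, i)
        if i % step == 0:
            checkpoints[i] = prev

    # Second pass: take segments from last to first, redecoding the
    # frames between checkpoints just before they are needed.
    last_start = ((gop_len - 1) // step) * step
    for start in range(last_start, -1, -step):
        end = min(start + step, gop_len)
        segment = [checkpoints.pop(start)]
        for i in range(start + 1, end):
            segment.append(decode_next(segment[-1], i))
        yield from reversed(segment)


# Toy decoder: a "frame" is just its own index.
def toy_decode(prev, i):
    return 0 if prev is None else prev + 1


print(list(reverse_frames(10, 4, toy_decode)))
```

In this model the number of frames held at once is on the order of `gop_len / step + step` instead of `gop_len`, at the cost of decoding most frames twice.
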
>> What QuickTime got right was having a ToC approach to video so being able
>> to seek rapidly was possible without thrashing, whereas the stream-oriented
>> approaches we are stuck with now mean knowing which bit of the file
>> to read to get the previous GOP is the hard part.
>> 
> 
> I don't understand. Can you explain this in more detail?

The movie file structure (and hence MP4) takes a table-of-contents approach: each frame has its timestamps, file location, size, and keyframe status stored in compact tables in the head of the file.  This makes trick modes and the like easier, since you’re not reading the actual video data to find a keyframe.
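
To illustrate why such a table helps, here is a minimal sketch of finding the previous GOP and the bytes to read for it using only per-frame metadata. The `SampleEntry` record and the helper functions are hypothetical and flatten everything into one list for clarity; they are not the actual on-disk layout of a movie or MP4 file. Timestamps are integers in an assumed timescale, and the table is assumed to be sorted by time.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleEntry:
    """Hypothetical per-frame record: what the table of contents tells us."""
    timestamp: int      # in timescale ticks
    offset: int         # byte position of the frame in the file
    size: int           # frame size in bytes
    is_keyframe: bool


def keyframe_indices(table):
    return [i for i, s in enumerate(table) if s.is_keyframe]


def gop_containing(table, keyframes, t):
    """Return (first, end) sample indices of the GOP covering time t."""
    timestamps = [s.timestamp for s in table]
    sample = max(bisect_right(timestamps, t) - 1, 0)
    k = bisect_right(keyframes, sample) - 1
    if k < 0:
        raise ValueError("no keyframe at or before the requested time")
    end = keyframes[k + 1] if k + 1 < len(keyframes) else len(table)
    return keyframes[k], end


def previous_gop(keyframes, first):
    """Return (first, end) of the GOP before the one starting at `first`."""
    k = keyframes.index(first)
    return None if k == 0 else (keyframes[k - 1], first)


def byte_range(table, first, end):
    """Smallest byte span covering samples first..end-1."""
    samples = table[first:end]
    return (min(s.offset for s in samples),
            max(s.offset + s.size for s in samples))


# Eight 100-byte frames, 40 ticks apart, with keyframes at samples 0 and 4.
table = [SampleEntry(i * 40, 1000 + i * 100, 100, i % 4 == 0) for i in range(8)]
keys = keyframe_indices(table)
current = gop_containing(table, keys, 200)
before = previous_gop(keys, current[0])
print(current, before, byte_range(table, *before))
```

No video data is touched until the final byte range is known, which is the property that makes reverse play and other trick modes cheap to plan.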

David Singer
Manager, Software Standards, Apple Inc.

Received on Tuesday, 1 September 2015 17:30:32 UTC