Re: Encoders (Re: Getting rid of SDP)

On 03/13/2018 11:21 PM, Cullen Jennings wrote:
>
>
>> On Mar 6, 2018, at 12:28 AM, Harald Alvestrand <harald@alvestrand.no
>> <mailto:harald@alvestrand.no>> wrote:
>>
>> On 03/06/2018 12:10 AM, Peter Thatcher wrote:
>>>
>>>
>>> On Mon, Mar 5, 2018 at 3:06 PM Sergio Garcia Murillo
>>> <sergio.garcia.murillo@gmail.com
>>> <mailto:sergio.garcia.murillo@gmail.com>> wrote:
>>>
>>>     More flexibility, more work, more bug surface.. ;)
>>>
>>>     Anyway, I am not particularly against having access to RTP
>>>     packets from the encoders,
>>>
>>>
>>> Encoders do not emit RTP packets.  They emit encoded video frames. 
>>> Those are then packetized into RTP packets.  I'm suggesting the
>>> JS/wasm have access to the encoded frame before the packetization. 
>>
>>
>> Actually, encoders usually take raw framebuffers (4:2:0, 4:4:4 or
>> other formats) + metadata and emit video frames + metadata. It may be
>> crucially important to get a handle on what the metadata looks like,
>> in order to make sure we are able to transport not just the bytes of
>> the frame, but the metadata too.
>>
>> Metadata includes things like timing information (carried through the
>> encoding process), interframe dependencies (an output from the
>> encoding process) and preferences for encoding choices (input to the
>> encoding process).
>
> Good point - so to list some of these:
>
> list of what metadata video encoders produce
> * if it is a reference frame or not
also which reference frames are needed to decode the frame (typically
identifying a golden frame, a keyframe or a previous frame)
> * resolution
> * frame-rate ?
I don't think encoders produce a frame rate as such, but they definitely
produce the intended time interval between this frame and the previous one.
> * capture time of frame
are you thinking of SMPTE timecodes (absolute capture time, possibly far in
the past for recorded media) or an RTP-clock-style "timestamp that can be
used for relative positioning in time"?
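
To make the output side concrete, here is a rough TypeScript sketch of what
such per-frame output metadata could look like. All names here are invented
for illustration; this is not a proposed API surface.

  // Hypothetical shape of the metadata an encoder could attach to
  // each encoded frame (names invented for illustration only).
  interface EncodedFrameMetadata {
    // Dependency information: whether this frame can serve as a
    // reference, and which earlier frames must be available to
    // decode it (previous frame, last keyframe, golden frame).
    isReferenceFrame: boolean;
    requiredReferences: Array<"previous" | "keyframe" | "golden">;

    // Resolution of the encoded frame.
    width: number;
    height: number;

    // Intended interval since the previous frame, in milliseconds,
    // rather than a nominal frame rate.
    frameDurationMs: number;

    // Capture time; whether this is an absolute (SMPTE-style)
    // timecode or an RTP-style relative timestamp is exactly the
    // open question raised above.
    captureTimestampMs: number;
  }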
>
> list of what metadata video encoders need.
> * capture timestamp
> * source and target resolution
> * source and target frame-rate
> * target bitrate
> * max bitrate
> * max pixel rate

The last three are examples of what feeds into a frame-size control
algorithm; this may aim for constant bitrate, bounded bitrate with roughly
fixed quality while below the bitrate cap, best subjective impression (more
bits for the action scenes), or other targets.
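
As a strawman (not modeled on any particular encoder's rate control), the way
those three inputs could feed a per-frame budget might look roughly like this
in TypeScript; the names and the simplistic policy are mine, for illustration
only:

  // Purely illustrative per-frame bit budget driven by target
  // bitrate, max bitrate and the frame interval.
  function frameBitBudget(targetBitrateBps: number,
                          maxBitrateBps: number,
                          frameDurationMs: number): number {
    const seconds = frameDurationMs / 1000;
    // Aim for the target on average, but never let a single frame
    // exceed what the hard cap allows for its interval.
    return Math.min(targetBitrateBps, maxBitrateBps) * seconds;
  }

  // Max pixel rate constrains resolution x frame rate before the
  // encoder is even asked to produce a frame.
  function fitsPixelRate(width: number, height: number,
                         frameDurationMs: number,
                         maxPixelsPerSecond: number): boolean {
    return width * height * (1000 / frameDurationMs) <= maxPixelsPerSecond;
  }

A real rate controller would of course smooth over many frames and trade
quality against the targets listed above; the point is only that these are
the inputs it consumes.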

People also want to control things like:
- QP value ranges
- Motion / sharpness tradeoff (the famous "content-hint")
- SVC and simulcast controls
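
Again just to illustrate the shape such knobs could take (invented names,
not a proposal):

  // Hypothetical tuning parameters handed to the encoder.
  interface EncoderTuning {
    minQp?: number;   // lower bound on quantizer
    maxQp?: number;   // upper bound on quantizer
    contentHint?: "motion" | "detail" | "text";  // sharpness/motion tradeoff
    // SVC / simulcast: one entry per layer or simulcast stream,
    // each with its own scaling and bitrate target.
    layers?: Array<{
      scaleResolutionDownBy: number;
      targetBitrateBps: number;
      temporalLayers?: number;
    }>;
  }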

>
> Is that the type of thing you were thinking about? What needs to be added?

Yes, those are some of the things I'm thinking about.

>
>
>

-- 
Surveillance is pervasive. Go Dark.
