Re: Raw data API - 3 - Encoded data from Harald Alvestrand on 2018-05-29 (public-webrtc@w3.org from May 2018)

From: Harald Alvestrand <harald@alvestrand.no>
Date: Tue, 29 May 2018 13:09:33 +0200
To: Sergio Garcia Murillo <sergio.garcia.murillo@gmail.com>, public-webrtc@w3.org
Message-ID: <a8a4d571-0e40-e6c7-0045-b8f3c6ebe55b@alvestrand.no>

On 29 May 2018 12:21, Sergio Garcia Murillo wrote:
> I fail to see what the benefits would be of adding the media frame API
> to the RTP objects, especially if we intend to provide a lower-level raw
> RTP API.

I don't think we'll specify or implement all of the proposals I've
tossed out. This is more in the spirit of "exploring the design space".

> 
> In order to implement this API in the browser, the codec
> packetization of the encoded stream must be known by the browser.

Why? This particular API proposal was intended to avoid having that
property.

I agree that fitting it on the RTPSender doesn't seem like a Good Idea;
it may fit onto a GenericSender that is a superclass of RTPSender, or it
may be designed as a "mixin" API that we can apply to both RTPSender and
QUICSender (if such a beast exists), or "something else".

> Also, it is not possible to modify the frame (for example, to apply
> end-to-end media encryption) or to add metadata, because packetization
> requires access to the raw data.

Does it?
The H.264 packetization mode 0 and 1 packetizers (those are the ones
I've done code for recently) need to ensure that they know about frame
boundaries, and the frame needs to be encoded either with or without the
00 00 01 resync sequences, and there's numbering that has to be applied
correctly, but otherwise, I'm not aware that the packetizer needs to
know about the frame content.
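
To illustrate the point, here is a minimal sketch of the boundary-finding
step only. It assumes the frame arrives as an H.264 Annex B byte stream
with 00 00 01 start codes; the function name splitAnnexB is hypothetical
and not part of any proposal in this thread. Apart from locating the start
codes, the bytes of each unit are treated as opaque.

```typescript
// Hypothetical helper: split an Annex B byte stream into NAL units by
// scanning for 00 00 01 start codes. The payload bytes are never parsed.
function splitAnnexB(data: Uint8Array): Uint8Array[] {
  const payloadStarts: number[] = [];
  for (let i = 0; i + 2 < data.length; i++) {
    if (data[i] === 0 && data[i + 1] === 0 && data[i + 2] === 1) {
      payloadStarts.push(i + 3); // first byte after the start code
      i += 2;                    // skip the rest of the start code
    }
  }
  const units: Uint8Array[] = [];
  payloadStarts.forEach((start, k) => {
    // A unit ends where the next start code begins, or at end of data.
    let end = k + 1 < payloadStarts.length ? payloadStarts[k + 1] - 3 : data.length;
    // Drop zero bytes that belong to a longer start code (00 00 00 01);
    // a NAL unit itself does not end in a zero byte.
    while (end > start && data[end - 1] === 0) end--;
    units.push(data.subarray(start, end));
  });
  return units;
}

// Two units, the first introduced by a 4-byte start code.
const sample = new Uint8Array([0, 0, 0, 1, 0x67, 0xaa, 0, 0, 1, 0x68, 0xbb]);
const units = splitAnnexB(sample);
```

A real packetizer would additionally apply the numbering mentioned above
and fragment units that exceed the MTU; neither is shown here.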

> If the API would only allow forwarding the frames from an encoder/decoder
> (or another media source/sink) to the RTP objects, I would prefer a
> higher-level API that deals with streams and not individual packets.

We've been asked to provide APIs that allow us to deal with the
individual bytes. This is one option.
What would you like an interface to look like?

> 
> Best regards
> Sergio
> 
> On 29/05/2018 8:30, Harald Alvestrand wrote:
>>
>> *Access to encoded streams*
>>
>> A similar interface can be added to RTPSender and RTPReceiver,
>> respectively (and similar APIs for other transports, when defined).
>>
>>
>> Here the buffers would contain encoded video / audio data, and the
>> control block parameters would have to have enough information to let
>> the RTP packet headers be constructed - or, on the receiver, the info
>> from the RTP packet headers be represented.
>>
>>
>> partial interface RTPSender {
>>
>>    Promise<encodedBuffer> injectData(encodedBuffer buffer);
>>
>> };
>>
>>
>> partial interface RTPReceiver {
>>
>>    Promise<encodedBuffer> extractData(encodedBuffer buffer);
>>
>> };
>>
>>
>> interface encodedBuffer : Buffer {
>>
>>    long rtpTimestamp;
>>
>>    long frameId;
>>
>>    sequence<long> dependsOnFrames;
>>
>>    // more fields TBD
>>
>> };
>>
>>
>> On this interface, frames do have interdependencies, so dropping
>> packets is much more problematic. The “dependsOnFrames” member is
>> intended to help decide on sensible handling: it would tell the
>> other side of the API that “if you dropped one of these frames, you
>> might as well drop this frame too”.
>>
>
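
One way to read the "dependsOnFrames" hint from the quoted proposal, as a
rough sketch: given the set of frames already lost, mark every frame that
depends on a dropped frame, directly or transitively, as droppable too.
The type EncodedFrameInfo and the function framesToDrop are hypothetical
names, and the sketch assumes frames are listed in decode order, so that
dependencies appear before the frames that need them.

```typescript
// Hypothetical shape mirroring the frameId / dependsOnFrames fields
// of the proposed encodedBuffer.
interface EncodedFrameInfo {
  frameId: number;
  dependsOnFrames: number[];
}

// Returns the ids of all frames that are lost or depend on a lost frame.
// Assumes `frames` is in decode order (dependencies come first).
function framesToDrop(frames: EncodedFrameInfo[], lost: Set<number>): Set<number> {
  const dropped = new Set<number>();
  lost.forEach((id) => dropped.add(id));
  for (const frame of frames) {
    if (frame.dependsOnFrames.some((id) => dropped.has(id))) {
      dropped.add(frame.frameId);
    }
  }
  return dropped;
}

// Frame 2 is lost; frame 3 depends on 2, frame 4 only on 1.
const frames: EncodedFrameInfo[] = [
  { frameId: 1, dependsOnFrames: [] },
  { frameId: 2, dependsOnFrames: [1] },
  { frameId: 3, dependsOnFrames: [2] },
  { frameId: 4, dependsOnFrames: [1] },
];
const dropped = framesToDrop(frames, new Set<number>([2]));
```

Here frames 2 and 3 end up in the dropped set, while frames 1 and 4 can
still be forwarded.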

Received on Tuesday, 29 May 2018 11:10:07 UTC