
Raw data APIs - 2 - Access to unencoded data

From: Harald Alvestrand <harald@alvestrand.no>
Date: Tue, 29 May 2018 08:29:14 +0200
To: "public-webrtc@w3.org" <public-webrtc@w3.org>
Message-ID: <def2aa3b-b23e-1f88-f192-2cee59e39bb8@alvestrand.no>

    *Proposal for Raw Data*


Extend MediaStreamTrack.

Let the following execute:

track = new MediaStreamTrack();

track.insertData(buffer).then((buffer) => recycleBuffer(buffer));

track.extractData(buffer).then((buffer) => processBuffer(buffer));

The buffers consumed and produced should be modeled after the C++ API
for injecting frames (for video) or samples (for audio).

For integration with other systems (in particular WebAssembly), it’s
important that buffers be provided by the application - this allows
copying to be minimized without risking security.

It’s also important that the ownership of buffers is well defined - that
we have clear demarcation points between when buffers are owned by the
MediaStreamTrack and its consumers, and when they are owned by the
application for refilling. A promise-based interface seems good for this
type of operation.
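
As a rough sketch of how the promise could mark the ownership handoff: the application owns a buffer while filling it, gives it up when it calls insertData(), and owns it again only once the promise resolves. BufferPool, sendOne and fakeTrack are hypothetical names invented for this illustration; fakeTrack merely stands in for the proposed method, which is not an existing browser API.

```javascript
// Hypothetical helper: a fixed set of application-allocated buffers.
class BufferPool {
  constructor(count, byteLength) {
    this.free = Array.from({ length: count }, () => new Uint8Array(byteLength));
  }
  // Number of buffers the application may currently fill.
  available() { return this.free.length; }
  // Take a buffer for filling, or null if all are in flight.
  take() { return this.free.length > 0 ? this.free.pop() : null; }
  // Return a buffer that the track has finished with.
  giveBack(buf) { this.free.push(buf); }
}

// Stand-in for the proposed insertData(); an assumption, not a real API.
const fakeTrack = {
  insertData(buf) { return Promise.resolve(buf); },
};

function sendOne(track, pool, fill) {
  const buf = pool.take();
  if (buf === null) return Promise.reject(new Error("no free buffer"));
  fill(buf); // application still owns the buffer here
  // From this call until the promise resolves, the track owns the buffer.
  return track.insertData(buf).then((returned) => {
    pool.giveBack(returned); // ownership is back with the application
    return returned;
  });
}

const pool = new BufferPool(2, 16);
const pending = sendOne(fakeTrack, pool, (buf) => buf.fill(0x80));
```

While the promise is pending, the pool reports one fewer available buffer, so the application cannot refill memory that the track may still be reading.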

Alternatively, a Streams-based interface can be used, such as the one
proposed earlier
<https://github.com/yellowdoge/streams-mediastreamtrack>. This would
still be defined in terms of the buffer construct below, but would use
Streams, and therefore require a data copy.

partial interface MediaStreamTrack {
    Promise<Buffer> insertData(Buffer buffer);   // will fail if track is connected to a source
    Promise<Buffer> extractData(Buffer buffer);  // will fail if track is connected to a sink
};


The buffers should be structures, not raw buffers. For instance:

interface Buffer {
    DOMString kind;       // video, audio, encoded-video, encoded-audio
    long long timestamp;  // for start of buffer. Definition TBD.
    BufferFormat format;
    ByteArray buffer;     // raw bytes, to be cast into appropriate format.
                          // need to study WebAsm linkages to get this processing efficient.
};


interface AudioDataBuffer : Buffer {
    // The name AudioBuffer is already used by WebAudio. This buffer has the same
    // properties, but can be used with multiple audio data formats.
    AudioBufferFormat format;
    float? sampleRate;  // only valid for audio
    int? channels;      // only valid for audio
};


enum AudioBufferFormat {
    "l16-audio",  // 16-bit integer samples
    "f32-audio",  // 32-bit floating point samples
};


interface VideoBuffer : Buffer {
    VideoBufferFormat format;
    int width;
    int height;
    DOMString rotation;
};


enum VideoBufferFormat {  // alternate - separate enums for audio and video
    "i444-video", "yuv-video",
};
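
To illustrate what "raw bytes, to be cast into appropriate format" might look like from JavaScript, here is a minimal sketch that views the bytes of an audio buffer as samples without copying. It assumes the ByteArray is exposed as a Uint8Array and that the structure is a plain object with the field names from the proposal; viewSamples and BYTES_PER_SAMPLE are hypothetical helpers, not part of the proposal.

```javascript
// Bytes per sample for the two formats listed in AudioBufferFormat.
const BYTES_PER_SAMPLE = { "l16-audio": 2, "f32-audio": 4 };

// Returns a typed-array view over the same memory; no samples are copied.
// The typed-array constructor throws a RangeError if byteOffset is not
// aligned to the sample size.
function viewSamples(audioBuffer) {
  const { format, buffer } = audioBuffer;
  const size = BYTES_PER_SAMPLE[format];
  if (size === undefined) throw new Error(`unknown format: ${format}`);
  const count = buffer.byteLength / size;
  if (!Number.isInteger(count)) {
    throw new Error("byte length is not a whole number of samples");
  }
  const View = format === "l16-audio" ? Int16Array : Float32Array;
  return new View(buffer.buffer, buffer.byteOffset, count);
}

// Assumed shape: a plain object mirroring the AudioDataBuffer fields.
const example = {
  kind: "audio",
  timestamp: 0,
  format: "l16-audio",
  sampleRate: 48000,
  channels: 2,
  buffer: new Uint8Array(480 * 2 * 2), // 10 ms of stereo 16-bit audio at 48 kHz
};

const samples = viewSamples(example);
samples[0] = -1234; // writes through to example.buffer
```

Because the view shares memory with the byte array, the same region could in principle be handed to WebAssembly code, which is the efficiency concern noted in the Buffer definition.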


One important aspect of such an interface is what happens under
congestion - if insertData() is called more frequently than the downstream
can consume, or if extractData() is called on a cadence that is slower
than once every frame produced at source.

In both these cases, for the raw image API, I think it is reasonable to
just drop frames. The consumer needs to be able to look at the
timestamps of the produced frames and do the Right Thing - raw frames
have no interdependencies, so dropping frames is OK.
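
One possible way to picture this drop policy is a single-slot mailbox: a newer frame overwrites one that has not been picked up yet, and the consumer infers drops from the gaps in the timestamps. LatestFrameSlot is a hypothetical name, and the millisecond timestamps are an assumption, since the timestamp definition is still TBD.

```javascript
// Holds at most one frame; a frame that arrives before the previous one
// was taken replaces it, and the older frame is counted as dropped.
class LatestFrameSlot {
  constructor() {
    this.pending = null;
    this.dropped = 0;
  }
  put(frame) {
    if (this.pending !== null) this.dropped++;
    this.pending = frame;
  }
  // Returns the most recent frame, or null if none arrived since the last take.
  take() {
    const frame = this.pending;
    this.pending = null;
    return frame;
  }
}

// Producer emits a frame every 33 ms (assumed unit); consumer keeps up
// with only every third frame.
const slot = new LatestFrameSlot();
const seen = [];
for (let i = 0; i < 9; i++) {
  slot.put({ timestamp: i * 33 });
  if (i % 3 === 2) seen.push(slot.take().timestamp);
}

// Gaps between consecutive timestamps tell the consumer how much was skipped.
const gaps = seen.slice(1).map((t, i) => t - seen[i]);
```

Here the consumer sees gaps of 99 ms instead of 33 ms and can adjust its processing, for example by scaling motion estimates, without needing any of the dropped frames.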


Surveillance is pervasive. Go Dark.
Received on Tuesday, 29 May 2018 06:29:40 UTC
