Raw data APIs - 2 - Access to unencoded data

*Proposal for Raw Data*

*

Extend MediaStreamTrack.

Let the following execute:


const track = new MediaStreamTrack();

track.insertData(buffer).then((buffer) => recycleBuffer(buffer));

track.extractData(buffer).then((buffer) => processBuffer(buffer));


The buffers consumed and produced should be modeled after the C++ API
for injecting frames (for video) or samples (for audio).

For integration with other systems (in particular WebAssembly), it’s
important that buffers be provided by the application - this allows
copying to be minimized without risking security.

It’s also important that the ownership of buffers is well defined - that
there are clear demarcation points between when a buffer is owned by the
MediaStreamTrack and its consumers, and when it is owned by the
application for refilling. A promise-based interface seems good for this
type of operation.
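
As a rough illustration of the ownership hand-off described above, here is a minimal sketch. Nothing in it is a real browser API: FakeTrack is a hypothetical stand-in for a track exposing the proposed insertData(), and BufferPool is an invented helper. The promise marks the demarcation point: between the call and its resolution the buffer belongs to the track, and afterwards it returns to the application for refilling.

```javascript
// Hypothetical stand-in for the proposed track; the real API does not exist.
class FakeTrack {
  // Resolves with the same buffer once the "track" has finished with it.
  insertData(buffer) {
    return Promise.resolve(buffer);
  }
}

// Invented helper that records which side currently owns each buffer.
class BufferPool {
  constructor(count, byteLength) {
    this.free = [];        // buffers owned by the application
    this.lent = new Set(); // buffers currently owned by a track
    for (let i = 0; i < count; i++) {
      this.free.push({ kind: "video", timestamp: 0, buffer: new Uint8Array(byteLength) });
    }
  }

  // The application takes a free buffer to fill, or null if all are lent out.
  acquire() {
    return this.free.length > 0 ? this.free.pop() : null;
  }

  // Ownership passes to the track until the returned promise resolves.
  send(track, buffer) {
    this.lent.add(buffer);
    return track.insertData(buffer).then((returned) => {
      this.lent.delete(returned);
      this.free.push(returned); // back in application hands for refilling
      return returned;
    });
  }
}
```

Because the application allocates every buffer up front, the same pattern would presumably work with memory that is shared with a WebAssembly module, which is the motivation given above for application-provided buffers.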

Alternatively, a Streams-based interface can be used, such as the one
proposed earlier
<https://github.com/yellowdoge/streams-mediastreamtrack>. This would
still be defined in terms of the buffer construct below, but would use
Streams, and therefore require a data copy.


partial interface MediaStreamTrack {

    Promise<Buffer> insertData(Buffer buffer);  // will fail if track is connected to a source

    Promise<Buffer> extractData(Buffer buffer);  // will fail if track is connected to a sink

}


The buffers should be structures, not raw buffers. For instance:


interface Buffer {

    DOMString kind;  // video, audio, encoded-video, encoded-audio

    long long timestamp;  // for start of buffer. Definition TBD.

    BufferFormat format;

    ByteArray buffer;  // raw bytes, to be cast into appropriate format.

                       // need to study WebAsm linkages to get this processing efficient.

}


interface AudioDataBuffer : Buffer {

    // The name AudioBuffer is already used by WebAudio. This buffer has the same

    // properties, but can be used with multiple audio data formats.

    AudioBufferFormat format;

    float? sampleRate;  // only valid for audio

    int? channels;  // only valid for audio

}


enum AudioBufferFormat {

  "l16-audio",  // 16-bit integer samples

  "f32-audio",  // 32-bit floating point samples

}
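
The two audio formats differ only in how each sample is represented, so an application that receives one and needs the other can convert between them. The sketch below assumes that "l16-audio" bytes can be viewed as a native-endian Int16Array and that the usual 1/32768 scaling applies; the proposal does not specify either point, and l16ToF32 is a hypothetical helper name.

```javascript
// Converts 16-bit integer samples to 32-bit floats in the range [-1, 1).
// Assumption: samples are already exposed as a native-endian Int16Array.
function l16ToF32(samples) {
  const out = new Float32Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    out[i] = samples[i] / 32768; // 32768 = 2^15, the magnitude of the most negative int16
  }
  return out;
}
```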


interface VideoBuffer : Buffer {

    VideoBufferFormat format;

    int width;

    int height;

    DOMString rotation;

}


enum VideoBufferFormat {  // alternate - separate enums for audio and video

  "i420-video",

  "i444-video",

  "yuv-video",

}
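
To make the structures above concrete, here is a sketch of how an application might allocate a buffer shaped like the proposed VideoBuffer for the "i420-video" format. The helper names are hypothetical, the rotation value is a placeholder (the proposal does not define the allowed strings), and a Uint8Array stands in for the unspecified ByteArray type. The size calculation relies only on the standard I420 layout: one full-resolution Y plane followed by U and V planes subsampled by two in each dimension.

```javascript
// Byte size of one I420 frame: Y plane plus two chroma planes, each
// subsampled by 2 horizontally and vertically (rounded up for odd sizes).
function i420ByteLength(width, height) {
  const chromaWidth = Math.ceil(width / 2);
  const chromaHeight = Math.ceil(height / 2);
  return width * height + 2 * chromaWidth * chromaHeight;
}

// Builds a plain object with the fields of the proposed VideoBuffer.
function makeI420Buffer(width, height, timestamp) {
  return {
    kind: "video",
    timestamp,             // units are TBD in the proposal
    format: "i420-video",
    width,
    height,
    rotation: "0",         // assumption: placeholder value, not defined by the proposal
    buffer: new Uint8Array(i420ByteLength(width, height)),
  };
}
```

For a 640x480 frame this allocates 460800 bytes, that is, 1.5 bytes per pixel.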


One important aspect of such an interface is what happens under
congestion - if insertData() is called more frequently than the downstream
can consume, or if extractData() is called at a cadence slower than once
per frame produced at the source.

In both these cases, for the raw image API, I think it is reasonable to
just drop frames. The consumer needs to be able to look at the
timestamps of the produced frames and do the Right Thing - raw frames
have no interdependencies, so dropping frames is OK.
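
One way to picture this drop policy is a single-slot mailbox in which a newer frame overwrites an older one that was never consumed. This is only a sketch under stated assumptions: LatestFrameSlot and framesSkipped are hypothetical names, and the timestamps and frame interval are assumed to share the same (still undefined) unit.

```javascript
// One-slot mailbox: a newer frame replaces an unconsumed older one.
class LatestFrameSlot {
  constructor() {
    this.frame = null;
    this.dropped = 0;
  }

  // Producer side: called once per frame produced.
  put(frame) {
    if (this.frame !== null) this.dropped++; // the previous frame was never taken
    this.frame = frame;
  }

  // Consumer side: returns the newest frame, or null if none is waiting.
  take() {
    const frame = this.frame;
    this.frame = null;
    return frame;
  }
}

// Consumer-side check: estimates how many frames were skipped between two
// consecutively received frames, given the nominal frame interval.
function framesSkipped(prevTimestamp, timestamp, frameInterval) {
  const intervals = Math.round((timestamp - prevTimestamp) / frameInterval);
  return Math.max(0, intervals - 1);
}
```

A consumer that cares about motion continuity could use the skipped-frame count to adjust its processing, while one that only needs the latest image can ignore it.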

*

-- 
Surveillance is pervasive. Go Dark.

Received on Tuesday, 29 May 2018 06:29:40 UTC