- From: Harald Alvestrand <harald@alvestrand.no>
- Date: Tue, 29 May 2018 08:29:14 +0200
- To: "public-webrtc@w3.org" <public-webrtc@w3.org>
- Message-ID: <def2aa3b-b23e-1f88-f192-2cee59e39bb8@alvestrand.no>
Proposal for Raw Data
Extend MediaStreamTrack.
Let the following execute:
track = new MediaStreamTrack();
track.insertData(buffer).then((buffer) => recycleBuffer(buffer));
track.extractData(buffer).then((buffer) => processBuffer(buffer));
The buffers consumed and produced should be modeled after the C++ API
for injecting frames (for video) or samples (for audio).
For integration with other systems (in particular WebAssembly), it’s
important that buffers be provided by the application - this allows
copying to be minimized without risking security.
It’s also important that the ownership of buffers is well defined - that
we have clear demarcation points for when buffers are owned by the
MediaStreamTrack and its customers, and when they are owned by the
application for refilling. A promise-based interface seems well suited
to this type of operation.
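For illustration, a rough sketch of the ownership hand-off with the
promise-based shape (allocateBuffer, fillBuffer and source.nextFrame()
are made up for the example):

async function pumpFrames(track, source) {
  let buffer = allocateBuffer(); // application-owned scratch buffer
  for (;;) {
    fillBuffer(buffer, await source.nextFrame()); // refill while the app owns it
    // Ownership passes to the MediaStreamTrack here...
    buffer = await track.insertData(buffer);
    // ...and returns to the application when the promise resolves.
  }
}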
Alternatively, a Streams-based interface can be used, such as the one
proposed earlier
<https://github.com/yellowdoge/streams-mediastreamtrack>. This would
still be defined in terms of the buffer construct below, but would use
Streams, and therefore require a data copy.
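In that variant, consumption might look roughly like this (assuming,
purely for illustration, that the track exposes its output as a
ReadableStream of Buffer objects via a track.readable attribute):

async function consumeStream(track) {
  const reader = track.readable.getReader();
  for (;;) {
    const { value: buffer, done } = await reader.read();
    if (done) break;
    processBuffer(buffer); // application-defined; each buffer is a fresh copy
  }
}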
partial interface MediaStreamTrack {
  Promise<Buffer> insertData(Buffer buffer);  // will fail if track is connected to a source
  Promise<Buffer> extractData(Buffer buffer); // will fail if track is connected to a sink
};
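Usage on the extracting side might then look like this (makeEmptyBuffer
and handleFrame stand in for whatever the application actually does):

async function consumeFrames(track) {
  let buffer = makeEmptyBuffer(); // application-owned; returned on each round trip
  for (;;) {
    buffer = await track.extractData(buffer); // resolves once a frame is available
    handleFrame(buffer); // application-defined processing
  }
}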
The buffers should be structures, not raw buffers. For instance:
interface Buffer {
  DOMString kind;      // video, audio, encoded-video, encoded-audio
  long long timestamp; // for start of buffer. Definition TBD.
  BufferFormat format;
  ByteArray buffer;    // raw bytes, to be cast into the appropriate format.
                       // Need to study WebAssembly linkages to make this
                       // processing efficient.
};
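On the WebAssembly point: if the application allocates the byte storage
inside a WebAssembly.Memory, a wasm module can read and write frames in
place, which is the minimal-copy path. A rough sketch, sized for one
640x480 I420 frame:

const frameBytes = 640 * 480 * 3 / 2; // Y plane plus quarter-size U and V planes
const memory = new WebAssembly.Memory({ initial: 8 }); // 8 x 64 KiB pages
const bytes = new Uint8Array(memory.buffer, 0, frameBytes);
// A Buffer whose storage a wasm module can process directly:
const frame = { kind: "video", timestamp: 0, format: "i420-video", buffer: bytes };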
interface AudioDataBuffer : Buffer {
  // The name AudioBuffer is already used by WebAudio. This buffer has
  // the same properties, but can be used with multiple audio data formats.
  AudioBufferFormat format;
  float? sampleRate;
  long? channels;
};
enum AudioBufferFormat {
  "l16-audio", // 16-bit integer samples
  "f32-audio", // 32-bit floating point samples
};
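For concreteness, converting between the two formats is a single
scaling pass, roughly:

// Convert "l16-audio" samples to "f32-audio" samples in [-1, 1).
function l16ToF32(int16Samples) {
  const out = new Float32Array(int16Samples.length);
  for (let i = 0; i < int16Samples.length; i++) {
    out[i] = int16Samples[i] / 32768;
  }
  return out;
}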
interface VideoBuffer : Buffer {
  VideoBufferFormat format;
  long width;
  long height;
  DOMString rotation;
};
enum VideoBufferFormat { // alternative: separate enums for audio and video
  "i420-video",
  "i444-video",
  "yuv-video",
};
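As an example of what format plus width/height buys the application,
locating the planes of an "i420-video" buffer (I420 is 4:2:0 planar: a
full-resolution Y plane followed by quarter-resolution U and V planes;
even dimensions assumed):

function i420Planes(bytes, width, height) {
  const ySize = width * height;
  const uvSize = ySize / 4;
  return {
    y: bytes.subarray(0, ySize),
    u: bytes.subarray(ySize, ySize + uvSize),
    v: bytes.subarray(ySize + uvSize, ySize + 2 * uvSize),
  };
}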
One important aspect of such an interface is what happens under
congestion - if insertData() is called more frequently than the
downstream can consume, or if extractData() is called at a cadence
slower than the source produces frames.
In both these cases, for the raw image API, I think it is reasonable to
just drop frames. The consumer needs to be able to look at the
timestamps of the produced frames and do the Right Thing - raw frames
have no interdependencies, so dropping frames is OK.
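From the consumer side, detecting drops then reduces to watching for
timestamp gaps; a sketch, assuming microsecond timestamps and a made-up
nominal frame rate:

const frameIntervalUs = 1000000 / 30; // nominal 30 fps source
let lastTimestamp = null;
function onFrame(buffer) {
  if (lastTimestamp !== null &&
      buffer.timestamp - lastTimestamp > 1.5 * frameIntervalUs) {
    // Frames were dropped upstream; raw frames have no interdependencies,
    // so processing simply continues from the newest frame.
  }
  lastTimestamp = buffer.timestamp;
}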
--
Surveillance is pervasive. Go Dark.
Received on Tuesday, 29 May 2018 06:29:40 UTC