- From: Harald Alvestrand <harald@alvestrand.no>
- Date: Tue, 29 May 2018 08:29:14 +0200
- To: "public-webrtc@w3.org" <public-webrtc@w3.org>
- Message-ID: <def2aa3b-b23e-1f88-f192-2cee59e39bb8@alvestrand.no>
*Proposal for Raw Data*

Extend MediaStreamTrack. Let the following execute:

```
track = new MediaStreamTrack();
track.insertData(buffer).then((buffer) => recycleBuffer(buffer));
track.extractData(buffer).then((buffer) => processBuffer(buffer));
```

The buffers consumed and produced should be modeled after the C++ API for injecting frames (for video) or samples (for audio). For integration with other systems (in particular WebAssembly), it is important that buffers be provided by the application; this allows copying to be minimized without risking security. It is also important that the ownership of buffers be well defined: we need clear demarcation points at which buffers are owned by the MediaStreamTrack and its customers, and at which they are owned by the application for refilling.

A promise-based interface seems good for this type of operation. Alternatively, a Streams-based interface could be used, such as the one proposed earlier <https://github.com/yellowdoge/streams-mediastreamtrack>. This would still be defined in terms of the Buffer construct below, but would use Streams, and therefore require a data copy.

```
partial interface MediaStreamTrack {
  Promise<Buffer> insertData(Buffer buffer);  // will fail if track is connected to a source
  Promise<Buffer> extractData(Buffer buffer); // will fail if track is connected to a sink
};
```

The buffers should be structures, not raw buffers. For instance:

```
interface Buffer {
  DOMString kind;      // "video", "audio", "encoded-video", "encoded-audio"
  long long timestamp; // for start of buffer; definition TBD
  BufferFormat format;
  ByteArray buffer;    // raw bytes, to be cast into the appropriate format;
                       // need to study WebAssembly linkage to make this processing efficient
};

interface AudioDataBuffer : Buffer {
  // The name AudioBuffer is already used by WebAudio. This buffer has the same
  // properties, but can be used with multiple audio data formats.
  AudioBufferFormat format;
  float? sampleRate; // only valid for audio
  int? channels;     // only valid for audio
};

enum AudioBufferFormat {
  "l16-audio", // 16-bit integer samples
  "f32-audio", // 32-bit floating-point samples
};

interface VideoBuffer : Buffer {
  VideoBufferFormat format;
  int width;
  int height;
  DOMString rotation;
};

// Alternative: separate enums for audio and video formats.
enum VideoBufferFormat {
  "i420-video",
  "i444-video",
  "yuv-video",
};
```

One important aspect of such an interface is what happens under congestion: if insertData() is called more frequently than the downstream can consume, or if extractData() is called on a cadence slower than once per frame produced at the source. In both cases, for the raw image API, I think it is reasonable to just drop frames. The consumer needs to be able to look at the timestamps of the produced frames and do the Right Thing; raw frames have no interdependencies, so dropping frames is OK. (Illustrative sketches of both sides of this API follow below.)

--
Surveillance is pervasive. Go Dark.
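Appended for illustration: a minimal sketch of how an application might drive the source side of the proposed API. Everything here assumes the insertData() surface above existed; allocateVideoBuffer(), fillWithI420Frame() (e.g. a WebAssembly-backed generator), and the ~30 fps cadence are hypothetical application details, not part of the proposal.

```
// Sketch only: assumes the proposed track.insertData() exists.
// fillWithI420Frame() is a hypothetical application helper.
const WIDTH = 640, HEIGHT = 480;

function allocateVideoBuffer() {
  return {
    kind: 'video',
    timestamp: 0,
    format: 'i420-video',
    width: WIDTH,
    height: HEIGHT,
    rotation: '0',
    // I420 carries 12 bits per pixel (full-size Y plane + quarter-size U and V).
    buffer: new Uint8Array(WIDTH * HEIGHT * 3 / 2),
  };
}

async function produceFrames(track, frameCount) {
  // A small pool makes the ownership handoff visible: a buffer belongs to the
  // application while it sits in the pool, and to the track between the
  // insertData() call and the promise resolving.
  const pool = [allocateVideoBuffer(), allocateVideoBuffer()];
  for (let i = 0; i < frameCount; i++) {
    const buf = pool.pop() || allocateVideoBuffer();
    buf.timestamp = i * 33333;                    // ~30 fps, in microseconds (definition TBD)
    fillWithI420Frame(buf);                       // hypothetical: write pixels into buf.buffer
    const returned = await track.insertData(buf); // ownership passes to the track here
    pool.push(returned);                          // returned buffer is ours again; recycle it
  }
}

const track = new MediaStreamTrack();
produceFrames(track, 300).catch(console.error);
```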
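On the sink side, a sketch of the frame-drop handling described in the congestion paragraph: the consumer pulls with extractData() and compares timestamps to detect gaps. The frame interval and processBuffer() are assumptions; allocateVideoBuffer() is the hypothetical helper from the producer sketch above.

```
// Sketch only: assumes the proposed track.extractData() exists.
// processBuffer() is a hypothetical application function.
const FRAME_INTERVAL_US = 33333; // assumed ~30 fps source

async function consumeFrames(track, frameCount) {
  let buf = allocateVideoBuffer(); // application-owned scratch buffer (see above)
  let lastTimestamp = -1;
  for (let i = 0; i < frameCount; i++) {
    buf = await track.extractData(buf); // resolves once the buffer has been filled
    if (lastTimestamp >= 0 && buf.timestamp - lastTimestamp > 1.5 * FRAME_INTERVAL_US) {
      // Frames were dropped upstream. Raw frames have no interdependencies,
      // so the consumer only notes the gap; no resynchronization is needed.
      console.warn(`gap of ${buf.timestamp - lastTimestamp} us; frame(s) dropped`);
    }
    lastTimestamp = buf.timestamp;
    processBuffer(buf); // hypothetical: consume the pixel data
  }
}
```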
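Finally, a sketch of how the AudioDataBuffer structure above might be populated from application-generated samples; the 48 kHz rate, mono channel count, and 440 Hz tone are purely illustrative assumptions.

```
// Sketch only: shaping 10 ms of f32 mono samples into the proposed
// AudioDataBuffer layout. Rate, channel count, and the tone are assumptions.
const sampleRate = 48000;
const samples = new Float32Array(sampleRate / 100); // 10 ms of audio
for (let i = 0; i < samples.length; i++) {
  samples[i] = 0.1 * Math.sin(2 * Math.PI * 440 * (i / sampleRate)); // 440 Hz tone
}
const audioBuf = {
  kind: 'audio',
  timestamp: 0,          // start of buffer; definition TBD in the proposal
  format: 'f32-audio',   // 32-bit floating-point samples
  sampleRate: sampleRate,
  channels: 1,
  buffer: new Uint8Array(samples.buffer), // raw-byte view over the sample data
};
// audioBuf could then be handed to track.insertData(audioBuf).
```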
Received on Tuesday, 29 May 2018 06:29:40 UTC