- From: Harald Alvestrand <harald@alvestrand.no>
- Date: Wed, 30 May 2018 14:10:15 +0200
- To: "Stojiljkovic, Aleksandar" <aleksandar.stojiljkovic@intel.com>, "public-webrtc@w3.org" <public-webrtc@w3.org>
On 30 May 2018 at 13:42, Stojiljkovic, Aleksandar wrote:
>> Interface Buffer ...
>>   Long long timestamp; // for start of buffer. Definition TBD.
>
> Maybe DOMHighResTimeStamp
> <https://www.w3.org/TR/hr-time/#sec-domhighrestimestamp>, to enable
> sync with other events, e.g. the Sensor interface
> <https://www.w3.org/TR/generic-sensor/#the-sensor-interface>.

DOMHighResTimeStamp is convenient for absolute times, when it's clear
what the value relates to (the time the light hit the camera sensor?).
For playback of stored media, we might instead want a clock relative to
the start of the media; there are well-known applications that do
things like time-stretching of video, and I'm not sure how that would
be representable in an API.

>> Enum VideoBufferFormat { // alternate - separate enums for audio and video
>>   "i420-video",
>>   "i444-video",
>>   "yuv-video",
>> }
>
> Suggestions:
> - add enums for other formats, e.g. YUY2, UYVY, MJPG...,
> - add y16-video for 16-bit single-plane infrared/D16 depth capture.
>
> Why not use fourcc codes?

Fourcc codes are probably a good idea. Do you have a good reference
definition for them?
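For what it's worth, a fourcc code is conventionally just four ASCII
characters packed into a 32-bit little-endian integer (this is how,
for example, V4L2 and libyuv define them), so mapping format strings
to codes is trivial. A sketch - the strings below follow common usage
rather than any single normative registry:

    function fourcc(s) {
      // Pack four ASCII characters into a 32-bit little-endian integer.
      if (s.length !== 4) throw new RangeError('fourcc takes exactly four characters');
      return (s.charCodeAt(0) |
              (s.charCodeAt(1) << 8) |
              (s.charCodeAt(2) << 16) |
              (s.charCodeAt(3) << 24)) >>> 0;
    }

    fourcc('I420'); // 0x30323449
    fourcc('YUY2'); // 0x32595559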
> Kind Regards,
> Aleksandar
>
> ------------------------------------------------------------------------
> *From:* Harald Alvestrand [harald@alvestrand.no]
> *Sent:* Tuesday, May 29, 2018 9:29 AM
> *To:* public-webrtc@w3.org
> *Subject:* Raw data APIs - 2 - Access to unencoded data
>
> *Proposal for Raw Data*
>
> Extend MediaStreamTrack. Let the following execute:
>
>     track = new MediaStreamTrack();
>     track.insertData(buffer).then((buffer) => recycleBuffer(buffer));
>     track.extractData(buffer).then((buffer) => processBuffer(buffer));
>
> The buffers consumed and produced should be modeled after the C++ API
> for injecting frames (for video) or samples (for audio).
>
> For integration with other systems (in particular WebAssembly), it is
> important that buffers be provided by the application - this allows
> copying to be minimized without risking security.
>
> It is also important that the ownership of buffers is well defined -
> that there are clear demarcation points at which buffers are owned by
> the MediaStreamTrack and its customers, and at which they are owned by
> the application for refilling. A promise-based interface seems well
> suited to this type of operation.
>
> Alternatively, a Streams-based interface could be used, such as the one
> proposed earlier <https://github.com/yellowdoge/streams-mediastreamtrack>.
> This would still be defined in terms of the buffer construct below, but
> would use Streams, and would therefore require a data copy.
>
>     partial interface MediaStreamTrack {
>       Promise<Buffer> insertData(Buffer buffer);  // fails if the track
>                                                   // is connected to a source
>       Promise<Buffer> extractData(Buffer buffer); // fails if the track
>                                                   // is connected to a sink
>     };
>
> The buffers should be structures, not raw buffers. For instance:
>
>     interface Buffer {
>       DOMString kind;      // video, audio, encoded-video, encoded-audio
>       long long timestamp; // for start of buffer; definition TBD
>       BufferFormat format;
>       ByteArray buffer;    // raw bytes, to be cast into the appropriate
>                            // format; we need to study WebAssembly
>                            // linkage to make this processing efficient
>     };
>
>     interface AudioDataBuffer : Buffer {
>       // The name AudioBuffer is already used by WebAudio. This buffer
>       // has the same properties, but can be used with multiple audio
>       // data formats.
>       AudioBufferFormat format;
>       float? sampleRate; // only valid for audio
>       int? channels;     // only valid for audio
>     };
>
>     enum AudioBufferFormat {
>       "l16-audio", // 16-bit integer samples
>       "f32-audio", // 32-bit floating-point samples
>     };
>
>     interface VideoBuffer : Buffer {
>       VideoBufferFormat format;
>       int width;
>       int height;
>       DOMString rotation;
>     };
>
>     enum VideoBufferFormat { // alternate - separate enums for audio and video
>       "i420-video",
>       "i444-video",
>       "yuv-video",
>     };
>
> One important aspect of such an interface is what happens under
> congestion - if insertData() is called more frequently than the
> downstream can consume, or if extractData() is called on a cadence
> slower than once per frame produced at the source.
>
> In both of these cases, for the raw image API, I think it is
> reasonable to just drop frames. The consumer needs to be able to look
> at the timestamps of the produced frames and do the Right Thing - raw
> frames have no interdependencies, so dropping frames is OK.
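To make the congestion story concrete, a consumer of the proposed
extractData() might look roughly like the sketch below. This is
illustrative only: consumeFrames() and allocateVideoBuffer() are
assumed names, and frameDurationMs is an assumed per-track constant;
none of them are part of the proposal.

    // Sketch: pump frames from a track, recycling a single app-owned
    // buffer. The application owns the buffer except between the call
    // to extractData() and the resolution of the returned promise.
    async function consumeFrames(track, processFrame) {
      let buffer = allocateVideoBuffer(); // assumed app-side allocator
      let lastTimestamp = null;
      for (;;) {
        buffer = await track.extractData(buffer); // track fills and returns it
        if (lastTimestamp !== null &&
            buffer.timestamp - lastTimestamp > frameDurationMs) {
          // One or more frames were dropped upstream. Raw frames have
          // no interdependencies, so it is safe to simply continue.
        }
        lastTimestamp = buffer.timestamp;
        processFrame(buffer); // ownership is back with the application
      }
    }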
Received on Wednesday, 30 May 2018 12:10:51 UTC