RE: API proposals for raw data - 1 - Introduction from Stojiljkovic, Aleksandar on 2018-05-29 (public-webrtc@w3.org from May 2018)

From: Stojiljkovic, Aleksandar <aleksandar.stojiljkovic@intel.com>
Date: Tue, 29 May 2018 07:56:58 +0000
To: Harald Alvestrand <harald@alvestrand.no>, "public-webrtc@w3.org" <public-webrtc@w3.org>
Message-ID: <7A0FAB90EDAE304EA98E76ACB132D51233253629@IRSMSX103.ger.corp.intel.com>

> Use Canvas to paint from a <video> tag (example)

It is important to mention WebGL usage here: uploading a video frame to a texture and performing further operations on it. As an example, see this section<https://www.w3.org/TR/mediacapture-depth/#upload-video-frame-to-webgl-texture> of the mediacapture-depth specification, with examples<https://www.w3.org/TR/mediacapture-depth/#dfn-upload-to-float-texture>.

WebRTC + WebGL is currently the only way to access the raw data of 16-bit and HDR video with no data loss. I am referring here to, e.g., computer vision use cases that process the frames in JavaScript or WebGL shaders.
Due to WebGL API constraints (no GL_RGBA16_EXT support), the application developer needs to use a float (or half-float) array. However, with the raw data access API being designed here, we could handle various formats, as I see is already planned.
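
To illustrate the float-array point, here is a minimal sketch (not taken from the mediacapture-depth specification) of normalizing 16-bit samples into a Float32Array on the JavaScript side. The names toNormalizedFloat32 and fromNormalizedFloat32 are hypothetical, and the sketch assumes a 16-bit value v is represented as v / 65535. A 32-bit float has a 24-bit significand, so this round trip is exact for 16-bit input; a half float, with an 11-bit significand, could not hold every 16-bit value.

```typescript
// Hypothetical helper: map 16-bit samples to normalized floats in [0, 1].
function toNormalizedFloat32(samples: Uint16Array): Float32Array {
  const out = new Float32Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    out[i] = samples[i] / 65535; // stored with float32 precision
  }
  return out;
}

// Hypothetical helper: map normalized floats back to 16-bit samples.
function fromNormalizedFloat32(values: Float32Array): Uint16Array {
  const out = new Uint16Array(values.length);
  for (let i = 0; i < values.length; i++) {
    out[i] = Math.round(values[i] * 65535);
  }
  return out;
}

// Round trip over a few representative 16-bit values (made-up sample data).
const depth = Uint16Array.from([0, 1, 255, 256, 4660, 65534, 65535]);
const restored = fromNormalizedFloat32(toNormalizedFloat32(depth));
```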

Kind Regards,
Aleksandar


________________________________
From: Harald Alvestrand [harald@alvestrand.no]
Sent: Tuesday, May 29, 2018 9:27 AM
To: public-webrtc@w3.org
Subject: API proposals for raw data - 1 - Introduction


This is part 1 of a multipart posting, laying out various API ideas in response to the need for "raw data access".

Problem:


Many creative usages of WebRTC audio and video data require direct access to the data - either in raw form or in encoded form.


All of these things are possible today, but they are not simple.


The current interfaces to get raw samples are:

  *   Use WebAudio to generate samples as arrays of floats in WebAudio’s clock (example<https://webrtc.github.io/samples/src/content/getusermedia/volume/>)

  *   Use Canvas to paint from a <video> tag (example<https://webrtc.github.io/samples/src/content/getusermedia/canvas/>)
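
As a rough sketch of the Canvas route: once a frame has been painted with drawImage() and read back with getImageData(), it is available as an RGBA byte array with 8 bits per channel. The helper below, with the hypothetical name meanLuma, shows the kind of per-pixel processing an application might then do. The Rec. 601 luma weights are an illustrative choice, and the browser calls appear only in the trailing comment because they need a page with a <video> element.

```typescript
// Hypothetical helper: average luma of an RGBA frame laid out like ImageData.data.
function meanLuma(rgba: Uint8ClampedArray): number {
  const pixelCount = rgba.length / 4;
  if (pixelCount === 0) return 0;
  let sum = 0;
  for (let i = 0; i < rgba.length; i += 4) {
    // Rec. 601 weights (illustrative choice); the alpha byte at i + 3 is ignored.
    sum += 0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2];
  }
  return sum / pixelCount;
}

// In a page, the input would come from something like:
//   ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
//   const frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
//   const luma = meanLuma(frame.data);
```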

For injecting raw data we have:

  *   Use WebAudio to generate a MediaStreamTrack from raw audio data

  *   Use canvas.captureStream() to capture frames from a Canvas (example<https://webrtc.github.io/samples/src/content/capture/canvas-video/>)

For getting encoded data we have:

  *   MediaRecorder (example<https://webrtc.github.io/samples/src/content/getusermedia/record/>)

For injecting encoded data we have:

  *   MSE feeding a <video> tag, and then generating a MediaStreamTrack from the <video> tag



All of these mechanisms are somewhat convoluted: they impose restrictions on the form of data that can be delivered (often imposing transcoding costs), and they require the application writer to be familiar with multiple shapes of API surface.


A simpler method is desirable.


--
Surveillance is pervasive. Go Dark.

Received on Tuesday, 29 May 2018 08:00:02 UTC