Re: Additional requirement - audio-only communication from Matthew Kaufman on 2011-09-06 (public-webrtc@w3.org from September 2011)

From: Matthew Kaufman <matthew.kaufman@skype.net>
Date: Mon, 05 Sep 2011 21:19:15 -0700
To: Eric Rescorla <ekr@rtfm.com>
CC: Harald Alvestrand <harald@alvestrand.no>, Stefan Håkansson LK <stefan.lk.hakansson@ericsson.com>, "public-webrtc@w3.org" <public-webrtc@w3.org>
Message-ID: <4E659F43.30407@skype.net>

On 8/25/2011 9:02 AM, Eric Rescorla wrote:
> Matthew,
>
> Do you think you could sketch out (or point to) what a sample app
> would look like using
> this API style?
>
>

Sorry for the delay on this one...

I'm basing these comments on the W3C webrtc draft spec.

1. There must be a way to get the capabilities without committing to use 
(and thus prompting the user for permission to use) resources.

We need something like:
   getUserMediaCapabilities(options, successCallback);

Example:
  A web site that wished to prompt the user to start a call with a 
customer service representative *if* the user is equipped to do so might 
call the API as follows:

   // does NOT cause the user to be prompted for microphone permission
   navigator.getUserMediaCapabilities('audio', gotAudioCapabilities);
   function gotAudioCapabilities(capabilities) {
      // user has an audio capture device
      // ...now check 'capabilities' object to see if G.729 or G.711 is
      // supported for calling our call center directly... if yes, prompt
      // user, otherwise ignore that they have audio capture
   }
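
As a rough sketch of the check inside gotAudioCapabilities: nothing above defines what the 'capabilities' object looks like, so the 'codecs' array of name strings used here is purely an assumption for illustration, as are the helper name pickCallCenterCodec and the preference order.

```javascript
// Hypothetical shape: capabilities = { codecs: ['G.711', 'iSAC', ...] }.
// The draft spec defines no such object; this is only an illustration.
var CALL_CENTER_CODECS = ['G.729', 'G.711']; // assumed preference order

function pickCallCenterCodec(capabilities) {
  var offered = (capabilities && capabilities.codecs) || [];
  for (var i = 0; i < CALL_CENTER_CODECS.length; i++) {
    if (offered.indexOf(CALL_CENTER_CODECS[i]) !== -1) {
      return CALL_CENTER_CODECS[i];
    }
  }
  return null; // nothing usable: ignore that they have audio capture
}

function gotAudioCapabilities(capabilities) {
  var codec = pickCallCenterCodec(capabilities);
  // Only when this returns true would the page offer a "call us" button;
  // clicking it would then go on to getUserMedia and the permission prompt.
  return codec !== null;
}
```

The point of the split is that the page can decide whether to show the call button at all before any permission prompt appears.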

2. Encoding choices should be exposed by allowing encoder media streams 
to exist, rather than putting the encoding in the PeerConnection (and 
controlling the encoding using SDP to/from that PeerConnection object)

Example:
   // DOES cause the user to be prompted for camera permission
   navigator.getUserMedia('video user', gotStream);
   function gotStream(camera) {
     encodedStream = new CompressedMediaStream(camera, 'H.264');
     // or alternatively,
     //   encodedStream = new CompressedMediaStreamH264(camera);
     // -- different subclasses for each encoder. makes it harder to
     // combine with a sensible capabilities system however, as you really
     // want strings saying what you have and then a single constructor
     // that works with any in that list
     encodedStream.frameRate = 12;
     // whatever else, possibly including
     //   encodedStream.codecSpecificParams(stringBlobOfSettings);
     // note that setting this here right after construction works
     // particularly well because we can ensure that the encoder doesn't
     // start until control leaves this function

     // example continues below

Note that I haven't yet decided exactly how we should handle audio +
video... I think the answer is that they should stay separate all the
way to the addStream on the PeerConnection, but there are other
alternatives available where we have the audio encoder do audio
compression but pass video untouched, and the video encoder do video
compression but pass audio untouched, so we can treat the stream as a
single object with both.
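
To make the "single constructor plus deferred start" idea concrete, here is a minimal stand-in written in plain JavaScript. It is not a real browser API: the 'supported' list and the 'schedule' hook are extra parameters assumed only for this sketch, and the default frame rate is arbitrary.

```javascript
// Stand-in for the proposed CompressedMediaStream; not a real browser API.
// 'supported' and 'schedule' are extra parameters assumed for this sketch.
function CompressedMediaStream(source, codec, supported, schedule) {
  // One constructor that accepts any codec string from the capability list.
  if (supported.indexOf(codec) === -1) {
    throw new Error('unsupported codec: ' + codec);
  }
  this.source = source;
  this.codec = codec;
  this.frameRate = 30;      // assumed default
  this.started = false;
  this.startedWith = null;  // settings the encoder actually started with

  var self = this;
  var defer = schedule || function (fn) { setTimeout(fn, 0); };
  // Defer the start so the caller can finish configuring the stream
  // before control leaves the function that constructed it.
  defer(function () {
    self.started = true;
    self.startedWith = { codec: self.codec, frameRate: self.frameRate };
  });
}
```

With the default scheduler, a caller that sets frameRate = 12 immediately after construction would see the encoder start with 12, since the deferred callback cannot run until the current function returns.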

3. RTP choices should be exposed by allowing RTP media streams to exist, 
rather than putting the RTP parameter setting in the PeerConnection (and 
controlling those parameters using SDP to/from that PeerConnection object)

Example from above continues:

     rtpStream = new RTPMediaStream(encodedStream);
     // RTP payload types are 7 bits (0-127); 96-127 is the dynamic range
     rtpStream.payloadType = 96;
     rtpStream.ssrc = 15;

     // note that if we tried to combine A+V into a single stream, we
     // need a more expressive (and yet uglier) way to set the PT and
     // SSRC for each

     // and we created our peerConnection earlier...

     // the only other alternative to this part of my proposal is to
     // change addStream to take the PT and SSRC as parameters, but
     // that's not nearly as clean
     peerConnection.addStream(rtpStream);
   }
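
For illustration, a stand-in RTPMediaStream could expose the PT and SSRC as validated properties. This is only a sketch and not a real browser API; the default values are assumptions, while the range checks follow the RTP header layout (a 7-bit payload type and a 32-bit SSRC).

```javascript
// Stand-in for the proposed RTPMediaStream; not a real browser API.
function RTPMediaStream(encodedStream) {
  this.encodedStream = encodedStream;
  var payloadType = 96; // assumed default, first dynamic payload type
  var ssrc = 0;         // assumed default

  Object.defineProperty(this, 'payloadType', {
    get: function () { return payloadType; },
    set: function (value) {
      // The RTP payload type field is 7 bits wide.
      if (Math.floor(value) !== value || value < 0 || value > 127) {
        throw new RangeError('payloadType must be an integer in 0..127');
      }
      payloadType = value;
    }
  });

  Object.defineProperty(this, 'ssrc', {
    get: function () { return ssrc; },
    set: function (value) {
      // The SSRC is an unsigned 32-bit value.
      if (Math.floor(value) !== value || value < 0 || value > 0xFFFFFFFF) {
        throw new RangeError('ssrc must be an integer in 0..4294967295');
      }
      ssrc = value;
    }
  });
}
```

Keeping audio and video as separate RTPMediaStream objects means each one carries exactly one PT and one SSRC, which is what keeps the property-based form simple.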

4. PeerConnection object should still have processSignalingMessage and 
onmessage callback, but the data sent via this channel should be 
restricted to ONLY the information necessary to successfully bring up 
the session using ICE.
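
One way to picture the restriction in item 4 is a whitelist applied before anything is handed to the signaling channel. The field names below (iceUfrag, icePwd, candidates) and the JSON encoding are hypothetical; the draft does not define such a format.

```javascript
// Field names are hypothetical; the draft does not define this format.
var ICE_ONLY_FIELDS = ['iceUfrag', 'icePwd', 'candidates'];

// Copy only the ICE-related fields, dropping codec, PT, SSRC and the like.
function toIceOnlyMessage(info) {
  var message = {};
  ICE_ONLY_FIELDS.forEach(function (field) {
    if (Object.prototype.hasOwnProperty.call(info, field)) {
      message[field] = info[field];
    }
  });
  return JSON.stringify(message);
}
```

Codec and RTP settings would then travel by whatever route the application chooses rather than through the PeerConnection.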

---

Note that all of this still allows for passing SDP around if you want. 
You simply need to write JavaScript to convert the capabilities into an 
offer and the answer back into explicit settings for the encoder(s) as 
well as the RTP bits. But more likely, you'd pass SDP around *only* for 
federation, and use something else (like just passing a JSON blob that 
came out of the configuration check right up to the web server to be 
handled there).
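
As a sketch of that conversion, the following turns a capabilities object into the audio media section of an SDP offer and maps an answer back to a codec name. The 'capabilities.codecs' array and the codec name strings are assumptions of this example; the payload type numbers and clock rates are the static RTP/AVP assignments for PCMU, PCMA and G729. A complete offer would also need the session-level lines (v=, o=, s=, t=, c=), which are omitted here.

```javascript
// Static RTP/AVP payload types and clock rates for the codecs named
// earlier. The name-to-encoding mapping is an assumption of this sketch.
var STATIC_AUDIO = {
  'G.711u': { pt: 0, encoding: 'PCMU', clock: 8000 },
  'G.711a': { pt: 8, encoding: 'PCMA', clock: 8000 },
  'G.729': { pt: 18, encoding: 'G729', clock: 8000 }
};

// 'capabilities.codecs' is a hypothetical array of codec name strings.
function capabilitiesToAudioOffer(capabilities, port) {
  var known = capabilities.codecs.filter(function (name) {
    return Object.prototype.hasOwnProperty.call(STATIC_AUDIO, name);
  });
  if (known.length === 0) {
    return null; // nothing we know how to describe
  }
  var pts = known.map(function (name) { return STATIC_AUDIO[name].pt; });
  var lines = ['m=audio ' + port + ' RTP/AVP ' + pts.join(' ')];
  known.forEach(function (name) {
    var c = STATIC_AUDIO[name];
    lines.push('a=rtpmap:' + c.pt + ' ' + c.encoding + '/' + c.clock);
  });
  return lines.join('\r\n') + '\r\n';
}

// Map the first payload type in an answer's m= line back to a codec name,
// which the page could then pass to the encoder constructor.
function answerToCodec(sdpMediaSection) {
  var match = /^m=audio \d+ RTP\/AVP (\d+)/m.exec(sdpMediaSection);
  if (!match) {
    return null;
  }
  var pt = Number(match[1]);
  for (var name in STATIC_AUDIO) {
    if (STATIC_AUDIO[name].pt === pt) {
      return name;
    }
  }
  return null;
}
```

The same capabilities object could just as easily be sent to the web server with JSON.stringify, which is the non-federated case described above.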

Matthew Kaufman

Received on Tuesday, 6 September 2011 04:20:20 UTC