Re: Where to attach a DTMF API from Justin Uberti on 2011-11-30 (public-webrtc@w3.org from November 2011)

From: Justin Uberti <juberti@google.com>
Date: Tue, 29 Nov 2011 23:25:00 -0500
To: Harald Alvestrand <harald@alvestrand.no>
Cc: public-webrtc@w3.org
Message-ID: <CAOJ7v-1wpu_8WOCrJRO8VYSkSkaDMjZYLFXyS0JFmjh8bGjScw@mail.gmail.com>

On Tue, Nov 29, 2011 at 2:59 PM, Harald Alvestrand <harald@alvestrand.no> wrote:

> On 11/29/2011 10:41 AM, Neil Stratford wrote:
>
>> On 29/11/2011 08:48, Stefan Håkansson LK wrote:
>>
>>> In the mail referenced, sendDTMF is a method on MediaStreamTrack. I think
>>> the method should apply on PeerConnection because my understanding is that
>>> the idea is to generate RTP-packets according to RFC4733, not to insert
>>> tones in the audio. This means that "sendDTMF" has really no meaning
>>> outside a PeerConnection.
>>>
>>> I understand that this means that there are some other things that have
>>> to be met:
>>> * There must be an audio MediaStreamTrack in at least one of the
>>> localStreams (that the DTMF RTP packets can share an SSRC with)
>>> * If there are several outgoing audio RTP streams (having different
>>> SSRCs), it must be possible to understand (control?) which SSRC will
>>> be reused by DTMF.
>>>
>>> My very simple proposal for this would be that the DTMF RTP packets will
>>> share SSRC with the first audio track of the first MediaStream that has at
>>> least one audio track. If there is no such MediaStream in localStreams,
>>> then "sendDTMF" will fail.
>>>
>> It is important that it is possible to send DTMF without any request for
>> microphone access if the call is purely to an informational IVR where the
>> caller is never expected to speak, but still needs to navigate that IVR.
>> Similarly there are cases where DTMF may be required but the call is video
>> only, with no audio component.
>>
>> How should we handle these cases? Can we create a null audio track using
>> the current API?
>>
> In Santa Clara, Chris Rogers suggested extending the Audio API with
> objects that create audio streams. It would seem simple to define such an
> object that generates silence.
>
> Unlike Stefan, I think the API makes most sense if it's an API on a
> MediaStreamTrack object. If it is an API on a PeerConnection, it would have
> to be something like PeerConnection.DTMF(StreamID, TrackID, "12345"), which
> seems somewhat bizarre. It could easily be defined to generate failure if
> the other end of the MediaStreamTrack is not a playout into a
> PeerConnection.
>

This matches my viewpoint. We've created a nice object-oriented API, so I'd
like to maintain that design as much as possible, even if it means a bit more
implementation work.

Follow-up question: should we define a specific AudioMediaStreamTrack that
inherits from MediaStreamTrack, and only expose this DTMF API on
AudioMediaStreamTracks? Or should we expose it from all tracks, and have it
throw an exception on tracks that don't support DTMF? And how should apps
know if DTMF is supported?
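
To make the two options concrete, here is a rough TypeScript sketch. It is
only an illustration: TrackLike, AudioTrackLike, isAudioTrack and
UniformTrack are hypothetical names, and nothing here is taken from a spec or
an implementation. Recording the tones in an array stands in for whatever the
browser would actually do to send them.

```typescript
// Hypothetical shapes for illustration; none of these are defined by a spec.
type TrackKind = "audio" | "video";

interface TrackLike {
  readonly kind: TrackKind;
}

// Option A: only the audio subtype carries the DTMF method.
interface AudioTrackLike extends TrackLike {
  readonly kind: "audio";
  sendDTMF(tones: string): void;
}

// With option A, an app detects the capability by checking the track's type.
function isAudioTrack(track: TrackLike): track is AudioTrackLike {
  return track.kind === "audio" && "sendDTMF" in track;
}

// Option B: every track exposes sendDTMF and throws where DTMF is unsupported.
class UniformTrack implements TrackLike {
  readonly kind: TrackKind;
  // Stand-in for real sending: remember what the app asked for.
  readonly sent: string[] = [];

  constructor(kind: TrackKind) {
    this.kind = kind;
  }

  sendDTMF(tones: string): void {
    if (this.kind !== "audio") {
      throw new Error(`DTMF is not supported on ${this.kind} tracks`);
    }
    this.sent.push(tones);
  }
}
```

With option A the check happens before the call and the method simply does
not exist on video tracks. With option B the app has to be ready to catch the
exception on every call.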

My suggestion would be to introduce AudioMediaStreamTrack, and have the
DTMF API fail if the other side doesn't support telephone-event. Support
for telephone-event can be determined from parsing the incoming SDP (with
ROAP), or the PeerConnection.remoteDescription method (with JSEP).
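
As a rough illustration of the SDP check, the TypeScript sketch below scans a
session description for an a=rtpmap line that maps a payload type listed in
an audio m= section to telephone-event. The function name
supportsTelephoneEvent is hypothetical, and how the SDP string is obtained
(from the ROAP message or from the remote description under JSEP) is left to
the caller.

```typescript
// Hypothetical helper: returns true if an audio m= section in the SDP
// advertises telephone-event via an a=rtpmap line.
function supportsTelephoneEvent(sdp: string): boolean {
  let inAudio = false;
  let audioPayloadTypes = new Set<string>();

  for (const raw of sdp.split(/\r?\n/)) {
    const line = raw.trim();

    if (line.startsWith("m=")) {
      // m=<media> <port> <proto> <fmt> ...
      const parts = line.slice(2).split(/\s+/);
      inAudio = parts[0] === "audio";
      audioPayloadTypes = new Set(inAudio ? parts.slice(3) : []);
      continue;
    }
    if (!inAudio) continue;

    // a=rtpmap:<payload type> <encoding name>/<clock rate>[/<params>]
    const match = /^a=rtpmap:(\d+)\s+([^/\s]+)\//.exec(line);
    if (
      match &&
      audioPayloadTypes.has(match[1]) &&
      match[2].toLowerCase() === "telephone-event"
    ) {
      return true;
    }
  }
  return false;
}
```

An app could run this on the remote description before enabling a dial pad.
The sketch only reports what the remote side advertises; it does not check
which events are listed in a=fmtp.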

Received on Wednesday, 30 November 2011 04:25:49 UTC