Re: [Bug 18485] Change DTMF API to be on PeerConnection

On 8/8/2012 10:20 AM, Stefan Hakansson LK wrote:
> On 08/08/2012 04:00 PM, Randell Jesup wrote:
>> On 8/8/2012 8:43 AM, Stefan Hakansson LK wrote:
>>> On 08/08/2012 12:52 PM, Harald Alvestrand wrote:
>>>> [Continuing discussion on list]
>>>>
>>>> I updated the bug in order to solicit views from the WG - would you
>>>> prefer A, B or C?
>>
>>>>> As for alternatives B and C, I think that the tones should not be
>>>>> inserted in the same MediaStream as the outgoing DTMF. The tones are
>>>>> to be used for local feedback, and you would not like to play the
>>>>> other outgoing audio locally.
>>>>>
>>>>> I think we should go for alternative A initially.
>>>> My thought was that if an app wants ringback of the tones, he does:
>>>>
>>>> incomingStream = pc.remoteStreams[0].audioTracks[0]
>>>> outgoingStream = pc.remoteStreams[0].audioTracks[0]
>>> Guess it should read
>>> outgoingStream = pc.localStreams[0].audioTracks[0]
>>>>
>>>> pc.sendDTMF(outgoingStream, "12345")
>>>> pc.sendDTMF(incomingStream, "12345")
>>>>
>>>> the two should then play out at ~ the same time.
>>>>
>>>> I don't want to force there to be always ringback present - that's app
>>>> dependent.
>>
>> Agree.
>>
>>> I agree, and this makes sense. Personally I don't have a strong opinion,
>>> we could go for A (the web author wanting local feedback could easily
>>> accomplish that with an audio element and some files with tones), B or C.
>>
>> This causes problems for speakerphone situations: in many/most
>> implementations (including those based on the webrtc.org code),
>> <audio>/<video> elements not part of the core webrtc logic may not be
>> fed into the echo canceller.  This means the tone would echo
>> uncontrolled into the microphone and to the far end, but distorted and
>> out of phase with local generation of DTMF (or for IVR systems, possibly
>> cause confusion, though probably not).
>
> If this is a real issue (I guess it depends on the implementation in 
> the browser and the underlying system) then we should avoid A. But 
> that would also mean that no other sounds could be produced while 
> using webrtc. Imagine that you get an email (in your web client) or 
> chat message and you have audio notifications enabled, those shouldn't 
> echo to the far end (or should they?).
>
>>
>>> We could even consider an alternative D:
>>>
>>> pc.canSendDTMF(MediaStreamTrack)
>>> pc.sendDTMF(MediaStreamTrack, tones, duration, optional MediaStreamTrack)
>>>
>>> where tones are inserted in the audio of the second (optional)
>>> MediaStreamTrack supplied.
>>
>> Personally, I'd have SendDTMF() operate on a MediaStreamTrack, and be an
>> event.  This would mean that if you have a MediaStreamTrack connected to
>> two PeerConnections (quite possible), the event would be cloned and
>> bubble up to both PeerConnections, which would then send DTMF (if
>> possible).  I'd have PlayDTMF() operate on a track in remote or
>> localstream via PeerConnection, and it would insert tones.
>>
>> However, this doesn't mean you have to change things; it could be the
>> app's responsibility to call pc.SendDTMF when a DTMF event arrives on
>> the MediaStream, instead of that being an automatic-but-overridable
>> behavior.  My general preference is to provide default actions for these
>> sorts of things, though, so simple apps can be simple.
>
> I agree in principle, but to me DTMF is only relevant for apps 
> interoperating with legacy, and in those cases you need to also deploy 
> e.g. an ICE terminating GW - in my view we could allow DTMF use to be 
> a bit more complex since it is not relevant for the simple 
> (browser-browser) case anyway.

That's kind of my point - many apps may not know (especially in 
federated cases) that they're talking to a PSTN legacy gateway and to an 
IVR.  I'll agree an app designed to be a virtual softphone from a 
company that provides PSTN gateways would likely include whatever is 
needed to get working DTMF.  That's why I suggest including a simple
default action on the event (assuming we define MediaStream events,
which I've suggested) and letting the app override that.
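
A minimal sketch of that default-action idea, in TypeScript.  Every name
here (DtmfEvent, FakePeerConnection, FakeTrack, dispatchDtmf, onDtmf) is
hypothetical and invented for illustration; none of it comes from a draft
or an implementation.  It only models the control flow: a DTMF event on a
track reaches every attached PeerConnection, which sends the tones unless
an app handler cancels the default.

```typescript
// Hypothetical model only: these classes stand in for real WebRTC objects.
class DtmfEvent {
  tones: string;
  defaultPrevented = false;
  constructor(tones: string) {
    this.tones = tones;
  }
  // An app handler calls this to suppress the default send.
  preventDefault(): void {
    this.defaultPrevented = true;
  }
}

// Stand-in for a PeerConnection; it just records what it would send.
class FakePeerConnection {
  sent: string[] = [];
  sendDTMF(tones: string): void {
    this.sent.push(tones);
  }
}

// Stand-in for a MediaStreamTrack that may be attached to several connections.
class FakeTrack {
  private handlers: Array<(ev: DtmfEvent) => void> = [];
  private connections: FakePeerConnection[] = [];

  attach(pc: FakePeerConnection): void {
    this.connections.push(pc);
  }

  onDtmf(handler: (ev: DtmfEvent) => void): void {
    this.handlers.push(handler);
  }

  // App handlers run first; the default action (send on every attached
  // connection) runs only if no handler called preventDefault().
  // Returns true when the default action ran.
  dispatchDtmf(tones: string): boolean {
    const ev = new DtmfEvent(tones);
    for (const handler of this.handlers) {
      handler(ev);
    }
    if (ev.defaultPrevented) {
      return false;
    }
    for (const pc of this.connections) {
      pc.sendDTMF(tones);
    }
    return true;
  }
}
```

With no handler registered, a simple app gets DTMF sent on each attached
connection for free.  An app that wants to do its own thing registers a
handler with onDtmf(), calls preventDefault(), and then decides for itself
whether to call sendDTMF() on any particular connection.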

All that said, this isn't a big issue either way.  Echo-cancelling local 
feedback is more important.

-- 
Randell Jesup
randell-ietf@jesup.org

Received on Wednesday, 8 August 2012 14:30:20 UTC