- From: Stefan Håkansson LK <stefan.lk.hakansson@ericsson.com>
- Date: Tue, 12 Jul 2011 13:38:00 +0200
- To: "Timothy B. Terriberry" <tterriberry@mozilla.com>
- CC: "public-webrtc@w3.org" <public-webrtc@w3.org>
>>> Perhaps one example is the sort of thing described by
>>> MediaStreamTrackHints in the proposal. The Opus audio codec from the
>>> IETF standardization effort can switch between separate "voip" and
>>> "audio" coding modes. The script setting up the connection may have
>>> context information about which of these are more appropriate for its
>
>> This is why the Web uses a declarative high-level model. It lets us make
>> the Web better in the future.
>
> I would argue the "voip" versus "audio" modes _are_ high-level
> declarations. Underneath they mean doing things like high-pass filtering
> to remove background noise and formant emphasis, automatic gain control,
> and other things that improve call quality substantially in the voip
> mode, but would destroy the quality of music or other general-purpose
> audio that isn't sourced from a low-quality microphone. Exactly what's
> done is up to the UA, but the intent is important.

If we do this, isn't there a big risk that we would enter a discussion on
what declarations could help a codec perform at its best? This could
become quite a large set of declarations (if you take video, information
about the scene like "sports", "talking head", "indoor", "outdoor", ...
could help the codec perform at its optimum). And this set could be
irrelevant for a new generation of codecs. "audio" vs "voip" is just one
example, and it is specific to one codec.

I think the general trend also is that codecs get better and better at
analysing the input signal to make the best of the situation. Many
audio/speech codecs automatically change their processing depending on
the input signal.

I think we should not go down this path, at least not now.

Stefan (in role of contributor)
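
[Editor's note: for readers less familiar with the codec, the "voip" versus
"audio" distinction discussed above corresponds to the application parameter
of the Opus encoder. The sketch below is illustrative only, assuming the
libopus reference C API; the helper function name is hypothetical and this is
not part of any proposal.]

    /* Minimal sketch: mapping a high-level "voip"/"audio" hint onto the
     * Opus encoder application mode via the libopus reference API.
     * Error handling is kept minimal for brevity. */
    #include <opus.h>
    #include <stddef.h>

    OpusEncoder *create_encoder_for_hint(int is_voip)
    {
        int err = OPUS_OK;
        /* 48 kHz mono; the application constant selects the coding mode. */
        int application = is_voip ? OPUS_APPLICATION_VOIP
                                  : OPUS_APPLICATION_AUDIO;
        OpusEncoder *enc = opus_encoder_create(48000, 1, application, &err);
        if (err != OPUS_OK)
            return NULL;
        /* The mode can also be changed later without recreating the
         * encoder, e.g. if the hint changes mid-session. */
        opus_encoder_ctl(enc, OPUS_SET_APPLICATION(application));
        return enc;
    }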
Received on Tuesday, 12 July 2011 11:38:24 UTC