- From: Timothy B. Terriberry <tterriberry@mozilla.com>
- Date: Tue, 04 Oct 2011 03:49:57 -0700
- To: public-webrtc@w3.org
What -- I'd like to propose that we add an API for providing a "hints" object to local media streams as a JSON-style associative array.

Why -- Media used for different types of applications may benefit significantly from different types of processing. For example, audio captured from consumer-grade handset or headset microphones should be subjected to automatic gain control (AGC), noise filtering (such as high-pass filtering to remove breath noise), and enhancement (such as noise shaping to emphasize formants for better understandability). It should also be transmitted using codecs or codec modes specifically designed for speech, using discontinuous transmission (DTX) covered by comfort noise generation (CNG) on the receiver side.

On the other hand, music or other audio captured from studio-quality equipment (see, e.g., use case 4.2.9 in the -05 use cases draft), read from pre-recorded files, or generated locally may be significantly impaired by such processing, may require different codecs or codec operating modes, and may be distorted by aggressive DTX thresholds. To be specific, the introduction of an adaptive high-pass filter in the SILK encoder (used in Opus) gave one of the largest call-quality improvements of any single feature; when applied to music, however, it can remove entire instruments.

To support a wide range of use cases, applications need some way to signal to the browser what kind of processing is appropriate for a given media stream. A general hints mechanism gives them an extensible way to do that.

How -- In the simplest form, I propose adding a JSONHints object as an argument to PeerConnection.addStream():

    void addStream(MediaStream stream, JSONHints hints);

The JSONHints object is a simple JSON-style associative array:

    JSONHints {
      "audioApplication": "General | VOIP",
      "videoApplication": "General | HeadAndShoulders",
      /* etc., TBD */
    }

A more complex approach would be to make the hints object an attribute of a MediaStream or MediaStreamTrack, as described in https://github.com/mozilla/rainbow/wiki/RTC_API_Proposal , but I'd like to see how far we can go with the simple approach first.

One advantage of this approach is that the hints can only be set when the stream is added, meaning browsers may set up their internal media processing pipeline based on them and don't have to handle the complexity of the application being able to change them at any time. In particular, if changing a hint requires choosing a different codec, offer/answer (O/A) might need to be re-run. In this approach, the only way to change the hints of a stream is to remove it and re-add it, which already implies the need to re-run O/A. Compared with the above-mentioned proposal, which uses an onTypeChanged callback that can possibly generate a new MediaStreamTrack, I think this approach is much simpler and easier to use, since in 99.9% of cases you will want to set the hints once and never change them.

Another advantage is that, by making this an argument to addStream(), it isn't as easy to lose track of the hints associated with a MediaStream. For example, if the hints were an attribute set on the MediaStream returned by getUserMedia(), and that stream were then cloned into a new MediaStream object to select a subset of the tracks, or run through a ProcessedMediaStream to apply some effect (possibly combining it with other MediaStreams with different hints), the semantics for propagating these hints would need to be defined.
Trying to have this happen "automatically" is a good way to get it wrong (especially in the ProcessedMediaStream case). Asking for the hints exactly where they're going to be used avoids that kind of complexity.

The disadvantage is that there's no way to query the hints after you've set them, but an application can simply hold on to the object it passed in.

Another place that might benefit from the addition of a JSONHints argument is MediaStream.record().

Whether this is sufficient, or a more explicit API is needed for things like codec selection, resolution, framerate, etc., is, however, a different question. I think hints are most appropriate for settings which the browser should be free to ignore if it doesn't want to implement them.

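
To make the proposed semantics concrete, here is a minimal TypeScript sketch of how hints passed to addStream() might behave. It is an illustration under stated assumptions, not a real browser API: SketchPeerConnection, FakeStream, AudioPipeline, pipelineFor(), the negotiations counter, and the removeStream() name are all hypothetical, and the mapping from "VOIP" to AGC, high-pass filtering, and DTX is only one plausible reading of the motivation above. Only addStream(), JSONHints, and the two hint keys come from the proposal itself.

```typescript
// Hint values taken from the proposal; the type names are hypothetical.
type AudioApplication = "General" | "VOIP";
type VideoApplication = "General" | "HeadAndShoulders";

interface JSONHints {
  audioApplication?: AudioApplication;
  videoApplication?: VideoApplication;
  // Other keys are permitted; an implementation is free to ignore them.
  [key: string]: string | undefined;
}

// Stand-in for a MediaStream; the sketch only needs an identifier.
interface FakeStream {
  id: string;
}

// Hypothetical summary of the speech-specific processing a browser might apply.
interface AudioPipeline {
  agc: boolean;
  highPass: boolean;
  dtx: boolean;
}

class SketchPeerConnection {
  private pipelines = new Map<string, AudioPipeline>();
  // Counts how often offer/answer would have to be re-run (assumed: once per add or remove).
  negotiations = 0;

  // Hints are fixed at add time, so the pipeline is configured exactly once here.
  addStream(stream: FakeStream, hints: JSONHints = {}): void {
    const speech = hints.audioApplication === "VOIP";
    this.pipelines.set(stream.id, { agc: speech, highPass: speech, dtx: speech });
    this.negotiations++;
  }

  removeStream(stream: FakeStream): void {
    if (this.pipelines.delete(stream.id)) {
      this.negotiations++;
    }
  }

  pipelineFor(stream: FakeStream): AudioPipeline | undefined {
    return this.pipelines.get(stream.id);
  }
}

// Usage: the application keeps its own copy of the hints, since they cannot be queried back.
const pc = new SketchPeerConnection();
const mic: FakeStream = { id: "mic" };
const micHints: JSONHints = { audioApplication: "VOIP" };
pc.addStream(mic, micHints);

// Changing hints means removing and re-adding the stream, which re-runs negotiation anyway.
pc.removeStream(mic);
pc.addStream(mic, { audioApplication: "General", someFutureHint: "ignored" });
```

In this sketch the unknown key someFutureHint is silently ignored, matching the idea that hints are advisory and a browser may disregard any it does not implement.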
Received on Tuesday, 4 October 2011 10:50:24 UTC