- From: Harald Alvestrand <harald@alvestrand.no>
- Date: Wed, 12 Oct 2011 18:26:00 +0200
- To: "Timothy B. Terriberry" <tterriberry@mozilla.com>
- CC: public-webrtc@w3.org
On reviewing the debate on this topic, I conclude that the change suggested is uncontroversial and represents the consensus of the group. The editors should add proposed language for this parameter to the specification.

NOTE: I don't think the form of the content of the "hints" object has any degree of consensus yet; whether we say Hints { "audio" : { "general": "voip" } } or Hints { "audioApplication": "General | VOIP" } doesn't seem like it's had any discussion so far; registration procedures for new parameter values are another nice source of complexity.

On 10/04/11 12:49, Timothy B. Terriberry wrote:
> What
> --
> I'd like to propose that we add an API for providing a "hints" object
> to local media streams as a JSON-style associative array.
>
> Why
> --
> Media used for different types of applications may benefit
> significantly from different types of processing. For example, audio
> captured from consumer-grade handset or headset microphones should be
> subjected to automatic gain control (AGC), noise filtering (such as
> high-pass filtering to remove breath noise, etc.), enhancement (such
> as noise shaping to emphasize formants for better understandability),
> and should be transmitted using codecs or codec modes specifically
> designed for speech, using discontinuous transmission (DTX) covered by
> comfort noise generation (CNG) on the receiver side. On the other
> hand, music or other audio captured from studio-quality equipment
> (see, e.g., use case 4.2.9 in the -05 use cases draft), pre-recorded
> files, or generated locally may be significantly impaired by such
> processing, may require different codecs or codec operating modes, and
> may be distorted by aggressive DTX thresholds. To be specific, the
> introduction of an adaptive high-pass filter in the SILK encoder (used
> in Opus) gave one of the largest call-quality improvements of any
> single feature; however, when applied to music it can remove entire
> instruments.
> In order to support a wide range of applications, they
> need some way to signal to the browser what kind of processing is
> appropriate for a given media stream. A general hints mechanism gives
> them an extensible way to do that.
>
> How
> --
> In the simplest form, I propose adding a JSONHints object as an
> argument to PeerConnection.addStream():
>
> void addStream(MediaStream stream, JSONHints hints);
>
> The JSONHints object is a simple JSON-style associative array:
>
> JSONHints {
>   "audioApplication": "General | VOIP",
>   "videoApplication": "General | HeadAndShoulders",
>   /* etc., TBD */
> }
>
> A more complex approach would be to make the hints object an attribute
> of a MediaStream or MediaStreamTrack, as described in
> https://github.com/mozilla/rainbow/wiki/RTC_API_Proposal , but I'd
> like to see how far we can go with the simple approach first.
>
> One advantage of this approach is that the hints can only be set when
> the stream is added, meaning browsers may set up their internal media
> processing pipeline based on them, and don't have to handle the
> complexity of the application being able to change them at any time.
> In particular, if changing a hint requires choosing a different codec,
> O/A might need to be re-run. In this approach, the only way to change
> the hints of a stream is to remove it and re-add it, which already
> implies the need to re-run O/A. Compared with the above-mentioned
> proposal, which uses an onTypeChanged callback which can possibly
> generate a new MediaStreamTrack, I think this approach is much
> simpler, and easier to use, since in 99.9% of cases you will want to
> set the hints once, and never change them.
>
> Another advantage is that, by making this an argument to addStream(),
> it isn't as easy to lose track of the hints associated with a
> MediaStream.
> For example, if the hints were an attribute set on the
> MediaStream returned by getUserMedia(), and then it was cloned into a
> new MediaStream object to select a subset of the tracks, or run
> through a ProcessedMediaStream to apply some effect (possibly
> combining it with other MediaStreams with different hints), the
> semantics for propagating these hints would need to be defined. Trying
> to have this happen "automatically" is a good way to get it wrong
> (especially in the ProcessedMediaStream case). By asking for the hints
> exactly where they're going to be used, that kind of complexity is
> avoided.
>
> The disadvantage is that there's no way to query the hints after
> you've set them, but an application can simply hold on to the object
> it passed in on its own.
>
> Another place that might benefit from the addition of a JSONHints
> argument is MediaStream.record(). Whether this is sufficient, or a
> more explicit API is needed for things like codec selection,
> resolution, framerate, etc., however, is a different question. I think
> hints are most appropriate for settings which the browser should be
> free to ignore if it doesn't want to implement them.
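
The semantics of the quoted addStream() proposal can be sketched in TypeScript roughly as follows. This is only an illustration under stated assumptions, not a browser API: SketchPeerConnection, StreamLike, AudioPipeline, planAudio, pipelineFor and the negotiations counter are hypothetical names, the mapping from "VOIP" to AGC, high-pass filtering and DTX is a guess at what a browser might do, and the choice to leave audio unprocessed when no hint is given is likewise an assumption.

```typescript
// Hypothetical model of the proposed hints argument; not a shipped API.
type AudioApplication = "General" | "VOIP";
type VideoApplication = "General" | "HeadAndShoulders";

interface JSONHints {
  audioApplication?: AudioApplication;
  videoApplication?: VideoApplication;
  // Room for further hints, which the proposal leaves TBD.
  [key: string]: string | undefined;
}

// Stand-in for a MediaStream; only its identity matters in this sketch.
interface StreamLike {
  readonly id: string;
}

// Illustrative processing switches a browser might derive from the hints.
interface AudioPipeline {
  agc: boolean;
  highPass: boolean;
  dtx: boolean;
}

function planAudio(hints: JSONHints): AudioPipeline {
  // Assumption: speech processing is enabled only for "VOIP"; a missing or
  // unrecognized hint is ignored and the audio is left untouched.
  const voip = hints.audioApplication === "VOIP";
  return { agc: voip, highPass: voip, dtx: voip };
}

class SketchPeerConnection {
  private pipelines = new Map<string, AudioPipeline>();
  // Counts how often offer/answer would have to be re-run.
  negotiations = 0;

  // Hints are fixed at the moment the stream is added.
  addStream(stream: StreamLike, hints: JSONHints = {}): void {
    if (this.pipelines.has(stream.id)) {
      throw new Error("stream already added");
    }
    this.pipelines.set(stream.id, planAudio(hints));
    this.negotiations++;
  }

  removeStream(stream: StreamLike): void {
    if (this.pipelines.delete(stream.id)) {
      this.negotiations++;
    }
  }

  // Inspection helper for this sketch only; the proposal offers no query API.
  pipelineFor(stream: StreamLike): AudioPipeline | undefined {
    return this.pipelines.get(stream.id);
  }
}

// The application keeps its own copy of the hints, since it cannot query them.
const pc = new SketchPeerConnection();
const mic: StreamLike = { id: "mic" };
const micHints: JSONHints = { audioApplication: "VOIP" };
pc.addStream(mic, micHints);

// Changing a hint means removing and re-adding the stream.
pc.removeStream(mic);
pc.addStream(mic, { ...micHints, audioApplication: "General" });
```

Because the hints are consumed once inside addStream(), the pipeline never has to react to a change while the stream is live; the remove-and-re-add step is the only path to new hints, and it already implies renegotiation.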
Received on Wednesday, 12 October 2011 16:26:51 UTC