- From: Timothy B. Terriberry <tterriberry@mozilla.com>
- Date: Tue, 04 Oct 2011 03:49:57 -0700
- To: public-webrtc@w3.org
What
--
I'd like to propose that we add an API for providing a "hints" object to
local media streams as a JSON-style associative array.
Why
--
Media used for different types of applications may benefit significantly
from different types of processing. For example, audio captured from
consumer-grade handset or headset microphones should be subjected to
automatic gain control (AGC), noise filtering (such as high-pass
filtering to remove breath noise, etc.), enhancement (such as noise
shaping to emphasize formants for better understandability), and should
be transmitted using codecs or codec modes specifically designed for
speech, using discontinuous transmission (DTX) covered by comfort noise
generation (CNG) on the receiver side. On the other hand, music or other
audio captured from studio-quality equipment (see, e.g., use case 4.2.9
in the -05 use cases draft), pre-recorded files, or generated locally
may be significantly impaired by such processing, may require different
codecs or codec operating modes, and may be distorted by aggressive DTX
thresholds. To be specific, the introduction of an adaptive high-pass
filter in the SILK encoder (used in Opus) gave one of the largest
call-quality improvements of any single feature; however, when applied to
music it can remove entire instruments. To support a wide range of
applications, there needs to be some way for an application to signal to
the browser what kind of processing is appropriate for a given media
stream. A general hints mechanism gives applications an extensible way to
do that.
How
--
In the simplest form, I propose adding a JSONHints object as an argument
to PeerConnection.addStream():
    void addStream(MediaStream stream, JSONHints hints);
The JSONHints object is a simple JSON-style associative array:
    JSONHints {
        "audioApplication": "General | VOIP",
        "videoApplication": "General | HeadAndShoulders",
        /* etc., TBD */
    }
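As a rough illustration of the shape above, here is a TypeScript sketch. It
is not a real browser API: MockPeerConnection, FakeStream and
hintsForTesting() are hypothetical names invented for this example, and the
set of hint values is only what is listed above (the rest is still TBD). It
assumes the hints are copied when the stream is added, so that they are
effectively fixed at addStream() time.

```typescript
// Hint values taken from the proposal; the final set is still TBD.
type AudioApplication = "General" | "VOIP";
type VideoApplication = "General" | "HeadAndShoulders";

interface JSONHints {
  audioApplication?: AudioApplication;
  videoApplication?: VideoApplication;
}

// Hypothetical stand-in for a MediaStream; only an id is needed here.
interface FakeStream {
  id: string;
}

// Hypothetical mock of a PeerConnection that records hints at addStream() time.
class MockPeerConnection {
  private hintsById = new Map<string, JSONHints>();

  addStream(stream: FakeStream, hints: JSONHints = {}): void {
    // Copy the hints so that later mutation by the caller has no effect.
    this.hintsById.set(stream.id, { ...hints });
  }

  removeStream(stream: FakeStream): void {
    this.hintsById.delete(stream.id);
  }

  // Test-only helper; the proposal itself offers no way to query hints.
  hintsForTesting(stream: FakeStream): JSONHints | undefined {
    return this.hintsById.get(stream.id);
  }
}

const pc = new MockPeerConnection();
const mic: FakeStream = { id: "mic" };
const callHints: JSONHints = {
  audioApplication: "VOIP",
  videoApplication: "HeadAndShoulders",
};
pc.addStream(mic, callHints);
```

Making every hint optional in the type reflects the idea that an application
only supplies the hints it cares about.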
A more complex approach would be to make the hints object an attribute
of a MediaStream or MediaStreamTrack, as described in
https://github.com/mozilla/rainbow/wiki/RTC_API_Proposal , but I'd like
to see how far we can go with the simple approach first.
One advantage of this approach is that the hints can only be set when
the stream is added, meaning browsers may set up their internal media
processing pipeline based on them, and don't have to handle the
complexity of the application being able to change them at any time. In
particular, if changing a hint requires choosing a different codec,
offer/answer (O/A) negotiation might need to be re-run. In this approach,
the only way to change the hints of a stream is to remove it and re-add
it, which already implies the need to re-run O/A. The above-mentioned
proposal instead uses an onTypeChanged callback, which can possibly
generate a new MediaStreamTrack. I think this approach is much simpler
and easier to use, since in 99.9% of cases you will want to set the hints
once and never change them.
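To make the remove-and-re-add idea concrete, here is a small TypeScript
sketch. Everything in it is hypothetical: NegotiatingConnection,
replaceHints() and the needsNegotiation flag are names made up for
illustration, and the offer/answer exchange is reduced to a single boolean.
It only models the point that changing hints goes through the same path that
already requires renegotiation.

```typescript
// Hypothetical hint shape; only the audio hint from the proposal is modelled.
type Hints = { audioApplication?: "General" | "VOIP" };

// Hypothetical mock: tracks per-stream hints and whether O/A must be re-run.
class NegotiatingConnection {
  needsNegotiation = false;
  private hintsById = new Map<string, Hints>();

  addStream(streamId: string, hints: Hints = {}): void {
    this.hintsById.set(streamId, { ...hints });
    this.needsNegotiation = true; // adding a stream implies a new O/A round
  }

  removeStream(streamId: string): void {
    if (this.hintsById.delete(streamId)) {
      this.needsNegotiation = true; // so does removing an existing one
    }
  }

  // Stand-in for a completed offer/answer exchange.
  negotiate(): void {
    this.needsNegotiation = false;
  }

  // Test-only view of the stored hints; not part of the proposal.
  hintsForTesting(streamId: string): Hints | undefined {
    return this.hintsById.get(streamId);
  }
}

// Hypothetical helper: "changing" hints is just remove followed by re-add.
function replaceHints(conn: NegotiatingConnection, streamId: string, hints: Hints): void {
  conn.removeStream(streamId);
  conn.addStream(streamId, hints);
}

const conn = new NegotiatingConnection();
conn.addStream("mic", { audioApplication: "VOIP" });
conn.negotiate();
replaceHints(conn, "mic", { audioApplication: "General" });
```

After replaceHints() the mock is flagged for renegotiation again, with no
separate "hints changed" code path needed.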
Another advantage is that, by making this an argument to addStream(), it
isn't as easy to lose track of the hints associated with a MediaStream.
For example, if the hints were an attribute set on the MediaStream
returned by getUserMedia(), and then it was cloned into a new
MediaStream object to select a subset of the tracks, or run through a
ProcessedMediaStream to apply some effect (possibly combining it with
other MediaStreams with different hints), the semantics for propagating
these hints would need to be defined. Trying to have this happen
"automatically" is a good way to get it wrong (especially in the
ProcessedMediaStream case). By asking for the hints exactly where
they're going to be used, that kind of complexity is avoided.
The disadvantage is that there's no way to query the hints after you've
set them, but an application can simply hold on to the object it passed
in on its own.
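One way an application might hold on to its hints is sketched below in
TypeScript. HintRegistry, remember() and lookup() are hypothetical names, not
part of any proposed API; the sketch assumes the application keys the hints
by the stream object it passed to addStream().

```typescript
// Hints are treated as an opaque, read-only string map in this sketch.
type StreamHints = Readonly<Record<string, string>>;

// Hypothetical application-side registry of the hints passed for each stream.
class HintRegistry<S extends object> {
  // A WeakMap lets the entry go away once the stream object is collected.
  private byStream = new WeakMap<S, StreamHints>();

  // Store a frozen copy and return it, so the same object can be passed on.
  remember(stream: S, hints: Record<string, string>): StreamHints {
    const copy: StreamHints = Object.freeze({ ...hints });
    this.byStream.set(stream, copy);
    return copy;
  }

  lookup(stream: S): StreamHints | undefined {
    return this.byStream.get(stream);
  }
}

const registry = new HintRegistry<object>();
const camera = { label: "camera" }; // stand-in for a MediaStream
const sentHints = registry.remember(camera, { videoApplication: "HeadAndShoulders" });
// In a real application, sentHints is what would be passed to addStream().
```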
Another place that might benefit from the addition of a JSONHints
argument is MediaStream.record(). Whether this is sufficient, or whether
a more explicit API is needed for things like codec selection,
resolution, framerate, etc., is a different question, however. I think
hints are most appropriate for settings which the browser should be free
to ignore if it doesn't want to implement them.
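The "free to ignore" property can be sketched as follows, again in
TypeScript. This is only an assumption about how a browser might consume
hints: resolveAudioProcessing() and the AudioProcessing fields are
hypothetical, the mapping of "VOIP" to AGC, high-pass filtering and DTX is
taken from the motivation above, and the choice of fallback for missing or
unrecognized values is arbitrary.

```typescript
// Hypothetical summary of the audio processing a browser might enable.
interface AudioProcessing {
  agc: boolean;      // automatic gain control
  highPass: boolean; // adaptive high-pass filtering
  dtx: boolean;      // discontinuous transmission
}

// Assumed fallback when no usable hint is present (arbitrary for this sketch).
const FALLBACK_PROCESSING: AudioProcessing = { agc: false, highPass: false, dtx: false };

// Hypothetical browser-side resolution: unknown keys and values are ignored
// rather than treated as errors.
function resolveAudioProcessing(hints: Record<string, unknown>): AudioProcessing {
  switch (hints["audioApplication"]) {
    case "VOIP":
      // Speech from consumer microphones benefits from all three.
      return { agc: true, highPass: true, dtx: true };
    case "General":
      // Music or pre-recorded audio may be impaired by such processing.
      return { agc: false, highPass: false, dtx: false };
    default:
      // Missing or unrecognized hint: fall back instead of failing.
      return { ...FALLBACK_PROCESSING };
  }
}

const voip = resolveAudioProcessing({ audioApplication: "VOIP" });
const unknown = resolveAudioProcessing({ audioApplication: "Karaoke", futureHint: "x" });
```

Because unrecognized values fall through to the fallback, an application can
pass hints that a given browser has never heard of without breaking anything.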
Received on Tuesday, 4 October 2011 10:50:24 UTC