
Re: CHANGE: Provide JSONHints interface for media streams

From: Stefan Håkansson <stefan.lk.hakansson@ericsson.com>
Date: Thu, 6 Oct 2011 14:47:14 +0200
Message-ID: <4E8DA352.4030604@ericsson.com>
To: public-webrtc@w3.org
This sounds like a good idea. But would not the natural time to add 
"hints" be at getUserMedia time?

My reasoning is that the "hints" (as you say below) are set once and
never changed 99.9% of the time, and they can also take effect earlier
in the chain (before the codec). E.g. if a hint is "fast moving
content", the camera might be set up to record at a higher frame rate.
The effect would be visible in a viewfinder even if the MediaStream is
never transported off the device using a PeerConnection.
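
As a minimal sketch of what capture-time hints could mean, the TypeScript below maps a hint to capture settings before any codec is involved. Everything in it is an assumption for illustration: the `CaptureHints` type, the `"FastMoving"` value (standing in for the "fast moving content" example), the `captureSettingsFor` helper, and the frame-rate numbers are hypothetical and are not part of getUserMedia or of any proposal in this thread.

```typescript
// Hypothetical hint vocabulary. "General" and "HeadAndShoulders" appear in the
// quoted proposal; "FastMoving" is an assumed name for "fast moving content".
type VideoApplication = "General" | "HeadAndShoulders" | "FastMoving";

interface CaptureHints {
  videoApplication?: VideoApplication;
}

// Hypothetical settings a browser might derive from a hint at capture time.
// The numbers are placeholders, not values from any implementation.
interface CaptureSettings {
  frameRate: number;
}

function captureSettingsFor(hints: CaptureHints): CaptureSettings {
  // Hints are advisory: a missing or unhandled hint falls back to the default.
  switch (hints.videoApplication) {
    case "FastMoving":
      return { frameRate: 60 };
    default:
      return { frameRate: 30 };
  }
}
```

Because the setting is chosen when the camera is opened, it would apply to a local preview as well, whether or not the stream is ever sent over a PeerConnection.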


On 10/05/2011 05:32 PM, Cullen Jennings wrote:
> Let me add a +1 for this and also mention that this was discussed in
> my slides at the last face-to-face meeting, and I think there was
> fairly good support for something along these lines.
> Cullen
> On Oct 4, 2011, at 4:49 AM, Timothy B. Terriberry wrote:
>> What -- I'd like to propose that we add an API for providing a
>> "hints" object to local media streams as a JSON-style associative
>> array.
>> Why -- Media used for different types of applications may benefit
>> significantly from different types of processing. For example,
>> audio captured from consumer-grade handset or headset microphones
>> should be subjected to automatic gain control (AGC), noise
>> filtering (such as high-pass filtering to remove breath noise,
>> etc.), enhancement (such as noise shaping to emphasize formants for
>> better understandability), and should be transmitted using codecs
>> or codec modes specifically designed for speech, using
>> discontinuous transmission (DTX) covered by comfort noise
>> generation (CNG) on the receiver side. On the other hand, music or
>> other audio captured from studio-quality equipment (see, e.g., use
>> case 4.2.9 in the -05 use cases draft), pre-recorded files, or
>> generated locally may be significantly impaired by such processing,
>> may require different codecs or codec operating modes, and may be
>> distorted by aggressive DTX thresholds. To be specific, the
>> introduction of an adaptive high-pass filter in the SILK encoder
>> (used in Opus) gave one of the largest call-quality improvements of
>> any single feature; however, when applied to music it can remove
>> entire instruments. To support a wide range of applications, there
>> needs to be some way for an application to signal to the browser
>> what kind of processing is appropriate for a given media stream. A
>> general hints mechanism gives applications an extensible way to do
>> that.
>> How -- In the simplest form, I propose adding a JSONHints object as
>> an argument to PeerConnection.addStream():
>> void addStream(MediaStream stream, JSONHints hints);
>> The JSONHints object is a simple JSON-style associative array:
>> JSONHints {
>>   "audioApplication": "General | VOIP",
>>   "videoApplication": "General | HeadAndShoulders",
>>   /* etc., TBD */
>> }
>> A more complex approach would be to make the hints object an
>> attribute of a MediaStream or MediaStreamTrack, as described in
>> https://github.com/mozilla/rainbow/wiki/RTC_API_Proposal , but I'd
>> like to see how far we can go with the simple approach first.
>> One advantage of this approach is that the hints can only be set
>> when the stream is added, meaning browsers may set up their
>> internal media processing pipeline based on them, and don't have to
>> handle the complexity of the application being able to change them
>> at any time. In particular, if changing a hint requires choosing a
>> different codec, offer/answer (O/A) might need to be re-run. In this
>> approach, the only way to change the hints of a stream is to remove
>> it and re-add it, which already implies the need to re-run O/A.
>> Compared with the
>> above-mentioned proposal, which uses an onTypeChanged callback
>> which can possibly generate a new MediaStreamTrack, I think this
>> approach is much simpler, and easier to use, since in 99.9% of
>> cases you will want to set the hints once, and never change them.
>> Another advantage is that, by making this an argument to
>> addStream(), it isn't as easy to lose track of the hints associated
>> with a MediaStream. For example, if the hints were an attribute set
>> on the MediaStream returned by getUserMedia(), and then it was
>> cloned into a new MediaStream object to select a subset of the
>> tracks, or run through a ProcessedMediaStream to apply some effect
>> (possibly combining it with other MediaStreams with different
>> hints), the semantics for propagating these hints would need to be
>> defined. Trying to have this happen "automatically" is a good way
>> to get it wrong (especially in the ProcessedMediaStream case). By
>> asking for the hints exactly where they're going to be used, that
>> kind of complexity is avoided.
>> The disadvantage is that there's no way to query the hints after
>> you've set them, but an application can simply hold on to the
>> object it passed in on its own.
>> Another place that might benefit from the addition of a JSONHints
>> argument is MediaStream.record(). Whether this is sufficient, or a
>> more explicit API is needed for things like codec selection,
>> resolution, framerate, etc., however, is a different question. I
>> think hints are most appropriate for settings which the browser
>> should be free to ignore if it doesn't want to implement them.
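
The set-once semantics described in the quoted proposal can be sketched in TypeScript as follows. This is a toy model under stated assumptions, not the PeerConnection API: `SketchPeerConnection`, `FakeStream`, and the `negotiations` counter are hypothetical names introduced here, and only the `addStream(stream, hints)` shape and the two hint keys come from the proposal.

```typescript
// Hint keys and values as listed in the proposal (the full set was left TBD).
interface JSONHints {
  audioApplication?: "General" | "VOIP";
  videoApplication?: "General" | "HeadAndShoulders";
}

// Hypothetical stand-in for a MediaStream; only an id is needed here.
interface FakeStream {
  id: string;
}

// Hypothetical model of a connection that accepts hints only at addStream().
class SketchPeerConnection {
  // Private on purpose: the proposal offers no way to query hints later.
  private streams = new Map<string, Readonly<JSONHints>>();
  // Counts how often offer/answer would need to be re-run in this model.
  negotiations = 0;

  addStream(stream: FakeStream, hints: JSONHints = {}): void {
    if (this.streams.has(stream.id)) {
      throw new Error("stream already added; remove it first to change hints");
    }
    // Copy and freeze so later edits to the caller's object have no effect.
    this.streams.set(stream.id, Object.freeze({ ...hints }));
    this.negotiations++;
  }

  removeStream(stream: FakeStream): void {
    if (this.streams.delete(stream.id)) {
      this.negotiations++;
    }
  }
}

// The application keeps its own copy of the hints it passed in.
const appHints = new Map<string, JSONHints>();
const pc = new SketchPeerConnection();
const mic: FakeStream = { id: "mic" };

appHints.set(mic.id, { audioApplication: "VOIP" });
pc.addStream(mic, appHints.get(mic.id));

// Changing hints means removing and re-adding the stream.
pc.removeStream(mic);
appHints.set(mic.id, { audioApplication: "General" });
pc.addStream(mic, appHints.get(mic.id));
```

In this model every hint change costs a remove and a re-add, which matches the trade-off the proposal accepts in exchange for a fixed processing pipeline.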
Received on Thursday, 6 October 2011 12:47:59 UTC