Re: CHANGE: Provide JSONHints interface for media streams

Thanks for the proposal, Tim!
I like the idea of attaching those hints conceptually to the 
PeerConnection/MediaStream interface, at least as a starting point.

One comment about vocabulary: the term "JSON" can be used in two ways:

- a string containing a representation of a JavaScript object, which can 
be reconstructed using a JSON parser
- a JavaScript object that can be represented by such a string

The proposal reads as if you're arguing for the second. In that case, I 
would suggest that the term "JSON" should not appear in the 
description; it's an object (a data-only object).
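
To make the distinction concrete, here is a minimal sketch of the two 
meanings. The key names come from the proposal quoted below; the values 
and the variable names are only illustrative assumptions, not part of 
any agreed API.

```javascript
// Hypothetical hint values; the keys are taken from the quoted proposal.
const hints = { audioApplication: "VOIP", videoApplication: "General" };

// Meaning 1: a JSON text, i.e. a string that a JSON parser can turn
// back into an object.
const jsonText = JSON.stringify(hints);

// Meaning 2: a data-only object, here rebuilt from that string.
const roundTripped = JSON.parse(jsonText);

console.log(typeof jsonText);     // "string"
console.log(typeof roundTripped); // "object"
```

If the API takes the second kind of thing, the caller just passes the 
object, and no JSON text is ever involved.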

I'm no expert in writing these things, but I have been chastised at 
least once for using the term "JSON" by people who've done it more, and 
feel the need to share the (hopefully correct) guidance.

On 10/04/2011 12:49 PM, Timothy B. Terriberry wrote:
> What
> -- 
> I'd like to propose that we add an API for providing a "hints" object 
> to local media streams as a JSON-style associative array.
>
> Why
> -- 
> Media used for different types of applications may benefit 
> significantly from different types of processing. For example, audio 
> captured from consumer-grade handset or headset microphones should be 
> subjected to automatic gain control (AGC), noise filtering (such as 
> high-pass filtering to remove breath noise, etc.), enhancement (such 
> as noise shaping to emphasize formants for better understandability), 
> and should be transmitted using codecs or codec modes specifically 
> designed for speech, using discontinuous transmission (DTX) covered by 
> comfort noise generation (CNG) on the receiver side. On the other 
> hand, music or other audio captured from studio-quality equipment 
> (see, e.g., use case 4.2.9 in the -05 use cases draft), pre-recorded 
> files, or generated locally may be significantly impaired by such 
> processing, may require different codecs or codec operating modes, and 
> may be distorted by aggressive DTX thresholds. To be specific, the 
> introduction of an adaptive high-pass filter in the SILK encoder (used 
> in Opus) gave one of the largest call-quality improvements of any 
> single feature; however, when applied to music it can remove entire 
> instruments. To support a wide range of use cases, applications need 
> some way to signal to the browser what kind of processing is 
> appropriate for a given media stream. A general hints mechanism gives 
> them an extensible way to do that.
>
> How
> -- 
> In the simplest form, I propose adding a JSONHints object as an 
> argument to PeerConnection.addStream():
>
>   void addStream(MediaStream stream, JSONHints hints);
>
> The JSONHints object is a simple JSON-style associative array:
>
> JSONHints {
>   "audioApplication": "General | VOIP",
>   "videoApplication": "General | HeadAndShoulders",
>   /* etc., TBD */
> }
>
> A more complex approach would be to make the hints object an attribute 
> of a MediaStream or MediaStreamTrack, as described in 
> https://github.com/mozilla/rainbow/wiki/RTC_API_Proposal , but I'd 
> like to see how far we can go with the simple approach first.
>
> One advantage of this approach is that the hints can only be set when 
> the stream is added, meaning browsers may set up their internal media 
> processing pipeline based on them, and don't have to handle the 
> complexity of the application being able to change them at any time. 
> In particular, if changing a hint requires choosing a different codec, 
> offer/answer (O/A) negotiation might need to be re-run. In this 
> approach, the only way to change the hints of a stream is to remove it 
> and re-add it, which already implies the need to re-run O/A. Compared 
> with the above-mentioned 
> proposal, which uses an onTypeChanged callback which can possibly 
> generate a new MediaStreamTrack, I think this approach is much 
> simpler, and easier to use, since in 99.9% of cases you will want to 
> set the hints once, and never change them.
>
> Another advantage is that, by making this an argument to addStream(), 
> it isn't as easy to lose track of the hints associated with a 
> MediaStream. For example, if the hints were an attribute set on the 
> MediaStream returned by getUserMedia(), and then it was cloned into a 
> new MediaStream object to select a subset of the tracks, or run 
> through a ProcessedMediaStream to apply some effect (possibly 
> combining it with other MediaStreams with different hints), the 
> semantics for propagating these hints would need to be defined. Trying 
> to have this happen "automatically" is a good way to get it wrong 
> (especially in the ProcessedMediaStream case). By asking for the hints 
> exactly where they're going to be used, that kind of complexity is 
> avoided.
>
> The disadvantage is that there's no way to query the hints after 
> you've set them, but an application can simply hold on to the object 
> it passed in on its own.
>
> Another place that might benefit from the addition of a JSONHints 
> argument is MediaStream.record(). Whether this is sufficient, or a 
> more explicit API is needed for things like codec selection, 
> resolution, framerate, etc., however, is a different question. I think 
> hints are most appropriate for settings which the browser should be 
> free to ignore if it doesn't want to implement them.
>
>

Received on Wednesday, 5 October 2011 11:52:11 UTC