Re: Revised Constraints modification API proposal

Hi Travis,

Thanks for putting this together. Generally speaking there is a lot to like here!

CIL below...

On Aug 21, 2012, at 10:56 PM, Travis Leithead <travis.leithead@microsoft.com> wrote:

> Hi folks! 
> 
> In preparation for the upcoming Telco, I've adjusted the "Constraints 
> modification API" proposal, originally proposed in [1] based on feedback 
> from Harald and others, as well as incorporating ideas from Rich's recent 
> proposal [2]. I hope that this proposal furthers the convergence that we
> all seek. I took the liberty of writing this out in narrative form, so 
> please read through to the end to get the full picture of this proposal.
> 
> The rest of this proposal is broken into three sections:
> 1. Creating the "suitable" home for the APIs in question
> 2. Defining how to get capabilities
> 3. Defining how to apply constraints
> 
> As a reminder, the goal of this proposal is to facilitate "informed 
> constraints" (i.e., allow constraints to be applied after existing client 
> capabilities are known) in order to avoid potential pitfalls of blindly 
> over-constrained use of getUserMedia across a range of different devices.

Strongly agree with this objective.

> 
> A secondary goal is to provide the right set of APIs for uniformly working
> with the devices that supply the local media stream tracks, for future APIs
> and scenarios we may wish to add.
> 
> If this proposal is adopted, I would expect that the existing constraint
> usage in getUserMedia could be significantly scaled back, if not removed
> altogether.

As mentioned previously, Opera have no intention currently to implement up-front constraints at getUserMedia invocation. Therefore, Opera fully support this statement (i.e. removal altogether).

Modifying the capabilities of an existing track puts the onus on developers to handle MediaStreamTracks on a best-effort basis. Refusing to work within the parameters of the majority of user hardware, or failing to provide best-effort processing of MediaStreamTracks, should be considered a shortcoming of the web app rather than a restriction enforced by any UA implementation. A web app would need a _very_ good reason to refuse to work with, e.g., whatever camera stream it is given, and if it does, it would be for the developers to explain to their users why they can't work with that stream. Most developers will want to avoid that situation, so they are implicitly tasked with taking greater care to accommodate whatever stream they are given, rather than doing half a job based on some biased perception of users' hardware formed at development time. Up-front constraints encourage developers to do only half a job rather than being inclusive by design.

> 
> Thanks!
> 
> -------------
> 
> 1. Create a suitable home for constraint application and capability 
>   retrieval that is strongly tied to the track concept.
> 
> In order to define a capabilities/constraints API that supports modifications,
> we need to carefully consider where it should live. I believe that it needs 
> to exist only for "LocalMediaStreams" so that there is no confusion/ambiguity 
> surrounding cloned MediaStreams or MediaStream objects obtained from a 
> PeerConnection (remotely over a network). 
> Note that providing a signaling channel or any other notification framework
> for constraint changes or capability requests over the network is out of scope 
> for this proposal.

Agree.

> 
> The existing tracks concept seems like a good place for this API. However,
> because tracks can be added/removed from various MediaStreams (and 
> LocalMediaStreams), there's no real guarantee that the tracks you're dealing
> with in the LocalMediaStream are the ones that are related to your "local" 
> device. It's also confusing for developers if some tracks have special "local"
> capabilities while others don't (in the same track list). This is a problem 
> that needs a solution if a consistent and reliable method for interacting with 
> just your "local" (approved) tracks is to be designed. I'll call this the "transient-
> track" problem.

Essentially, I think we want to avoid having to rely on typeof checking every object all the time. It's always nicer to feature detect based on the presence or absence of a property or set of properties on a list of JavaScript objects. You kind of solve this by adding audioDevices and videoDevices below, which implicitly return a different kind of object. That works for me.
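
To make the feature-detection point concrete, here is a rough sketch. The helper name is mine, the property names are borrowed from the per-device settings listed later in the proposal, and the tracks are plain-object stand-ins, so treat all of it as an assumption rather than a defined API:

```javascript
// Hypothetical property set: names taken from the proposal's examples of
// per-device video settings; none of this is a shipped API.
const VIDEO_DEVICE_PROPS = ["currentZoom", "currentFlashMode"];

// Detect a device-backed video track by the properties it exposes,
// rather than by checking the object's type against an interface.
function isLocalVideoDeviceTrack(track) {
  // Object(track) makes the `in` check safe for null/undefined input.
  return VIDEO_DEVICE_PROPS.every((name) => name in Object(track));
}

// Plain-object stand-ins for a device-backed track and an ordinary track.
const deviceTrack = { kind: "video", currentZoom: 1, currentFlashMode: "off" };
const remoteTrack = { kind: "video" };

console.log(isLocalVideoDeviceTrack(deviceTrack)); // true
console.log(isLocalVideoDeviceTrack(remoteTrack)); // false
```

With audioDevices/videoDevices returning only device-backed tracks, a check like this would mostly become unnecessary, which is the part I like.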

> 
> Note: I'm not totally satisfied with this proposal, because even LocalMediaStream
> objects are potentially transient, and if the developer loses the reference to the
> LocalMediaStream, then they're out of luck. It's also a little weird to have two 
> instances of LocalMediaStreams (from two calls to getUserMedia), but have basically
> the same information provided in parallel between them. Perhaps that's not a big deal.
> Feedback, as always, is welcome.
> 
> A new LocalMediaStreamTrackList is defined in order to address the transient-track 
> problem noted above. These track lists are different from MediaStreamTrackLists in 
> one notable way:
> 
> * There is no ability to remove/add tracks to these lists. The addition and removal of
> tracks in these lists are managed exclusively by the user agent.
> 
> The new track list's purpose is to aggregate all the "local" media stream tracks
> together for enumeration, settings, and constraint application. It also serves
> as a convenient list of the "active" local devices that have been approved by the
> user.
> 
> // +++ New interface
> interface LocalMediaStreamTrackList {
>   readonly attribute unsigned long length;
>   getter MediaStreamTrack (unsigned long index);
>   attribute EventHandler onaddtrack;
>   attribute EventHandler onremovetrack;
> };
> 
> The developer can be notified when new local audio/video devices are enabled
> by the user (by subsequent calls to getUserMedia for example) or when local audio/
> video devices have stopped by registering for the [familiar] onaddtrack/onremovetrack
> event handlers.
> 
> There are no add/remove APIs, as the management of this list is done exclusively 
> by the user agent as previously noted.
> 
> The LocalMediaStreamTrackList is surfaced on the LocalMediaStream interface as two
> new properties:
> 
> // Existing definition...
> interface LocalMediaStream: MediaStream {
>   // Existing stop API
>   void stop ();
>   // +++ New
>   readonly attribute LocalMediaStreamTrackList audioDevices;
>   readonly attribute LocalMediaStreamTrackList videoDevices;
> };

Separating out devices from tracks makes a lot of sense. I assume that audioTracks.length >= audioDevices.length always holds (since each item in audioDevices also has a representation in audioTracks)?

> 
> Each audio/video LocalMediaStreamTrackList is kept up-to-date among all 
> LocalMediaStream instances, so that it doesn't matter which instance of a
> LocalMediaStream object is used, it will always have the aggregate information
> for all active local audio/video devices [previously] approved by the user.

So the list of tracks returned from getUserMedia is mutable? That's a big change to the current proposal IIUC.

> 
> These two lists now represent (conceptually) the set of active devices
> supplying media streams on the user's local box. Each track within these 
> lists essentially represents a local device. So these lists are an enumeration
> of the user's active, approved devices (not necessarily all the devices that
> the user has available). Since these are approved devices, it follows that 
> requesting capabilities of and applying constraints to these devices is 
> acceptable without requesting additional permissions from the user.
> 
> Note that my proposal as-is doesn't describe how the developer can discover
> that there are more devices available.

We can probably leave this one to later. Initial thoughts are that we could add event listeners that inform when the total number of cameras/microphones available on the local device changes. A subsequent call to getUserMedia would still be required though to obtain potential access to those devices. It's in the developer's best interests to say something like 'Hey, we notice you've attached another camera! [Click here] if you want to use this camera'. At that point getUserMedia can be called, the opt-in UI displayed, and the process repeats.
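
A minimal sketch of that flow, under heavy assumptions: the FakeDeviceMonitor class, its cameraCount property, and the "devicecountchange" event name are all invented here for illustration, and nothing like them is defined in the proposal:

```javascript
// Hypothetical stand-in for a UA object that reports camera count changes.
class FakeDeviceMonitor extends EventTarget {
  constructor() {
    super();
    this.cameraCount = 1;
  }
  // Simulates the user plugging in another camera.
  attachCamera() {
    this.cameraCount += 1;
    this.dispatchEvent(new Event("devicecountchange"));
  }
}

const prompts = [];
const monitor = new FakeDeviceMonitor();
let knownCameras = monitor.cameraCount;

monitor.addEventListener("devicecountchange", () => {
  if (monitor.cameraCount > knownCameras) {
    // A real page would show this to the user and call getUserMedia only
    // after they click, which brings up the opt-in UI again.
    prompts.push("New camera detected. Click here to use it.");
  }
  knownCameras = monitor.cameraCount;
});

monitor.attachCamera();
console.log(prompts); // one prompt queued for the newly attached camera
```

The event only tells the page that the count changed; access to the new device still goes through getUserMedia and the user's approval.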

> 
> 2. Get the capabilities in terms of Track objects.
> 
> In the proposal above, notice that the LocalMediaStreamTrackList contains a 
> list of MediaStreamTrack objects. As already proposed by Rich in [2], I also 
> recommend a factoring of MediaStreamTrack into two local derived types: 
> LocalVideoStreamTrack and LocalAudioStreamTrack (my names include the prefix 
> "local" for clarity):
> 
> // +++ new factored interface for video-specific APIs
> interface LocalVideoStreamTrack : MediaStreamTrack {
> };
> 
> // +++ new factored interface for audio-specific APIs
> interface LocalAudioStreamTrack : MediaStreamTrack {
> };
> 
> The most-derived interfaces are always returned from the LocalMediaStreamTrackList.
> 
> The current state of the track can be reflected in APIs added to these objects.
> As described in [2], for the LocalVideoStreamTrack these might include:
> * autoFocusMode
> * currentZoom
> * currentFlashMode
> * currentDisplayOrientation
> * viewingAngle
> * etc.
> LocalAudioStreamTracks might have:
> * currentAudioLevel
> 
> Each local track contains the "current" settings, which can be directly 
> inspected. In order to find the "range" of the available settings, 
> e.g., the capabilities of a given track, I propose a common API across 
> both track types:
> 
> typedef sequence<MediaTrackConstraint>? LocalDeviceCapabilites;
> LocalDeviceCapabilites getCapabilities(optional (MediaStreamTrack or unsigned long) track);
> 
> This API is located on the *track list* (i.e., LocalMediaStreamTrackList) 
> in order to be able to operate on either a single track (e.g., "give me the 
> capabilities for a single device") or all the available devices (e.g., "give
> me the capabilities for all the available devices"). (By "available", I mean
> those devices already approved by getUserMedia).

What if two different cameras/microphones have conflicting capabilities? How would they be expressed in a single call to getCapabilities without any arguments?
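
To illustrate the ambiguity, here is a small sketch with two made-up cameras. The {min, max} capability shape and the zoom values are assumptions of mine, since the proposal does not spell out what a combined result looks like:

```javascript
// Hypothetical capability shape: a {min, max} range per setting name.
const cameraA = { zoom: { min: 1, max: 4 } };
const cameraB = { zoom: { min: 2, max: 10 } };

// Widest range covering both devices.
function unionRange(a, b) {
  return { min: Math.min(a.min, b.min), max: Math.max(a.max, b.max) };
}

// Range both devices support, or null when the ranges do not overlap.
function intersectRange(a, b) {
  const min = Math.max(a.min, b.min);
  const max = Math.min(a.max, b.max);
  return min <= max ? { min, max } : null;
}

function supports(range, value) {
  return value >= range.min && value <= range.max;
}

const combinedUnion = unionRange(cameraA.zoom, cameraB.zoom);
const combinedIntersection = intersectRange(cameraA.zoom, cameraB.zoom);

console.log(combinedUnion);        // { min: 1, max: 10 }
console.log(combinedIntersection); // { min: 2, max: 4 }
// A zoom of 8 fits the union but not camera A on its own.
console.log(supports(combinedUnion, 8), supports(cameraA.zoom, 8)); // true false
```

Either reading of "combined" loses information: the union advertises values one device cannot honour, and the intersection hides values one device can.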

> 
> // New interface
> interface LocalMediaStreamTrackList {
>   readonly attribute unsigned long length;
>   getter MediaStreamTrack (unsigned long index);
>   attribute EventHandler onaddtrack;
>   attribute EventHandler onremovetrack;
>   // +++ capabilities API
>   LocalDeviceCapabilites getCapabilities(optional (MediaStreamTrack or unsigned long) track);
> };
> 
> If no parameter is provided, then the combined capabilities of all available tracks 
> for the given list (audio/video) is returned. If the developer only has a single track,
> the API is very simple:
> 
> var caps = localStream.videoDevices.getCapabilities();
> 
> If there are multiple video devices currently active, then the developer 
> uses the same code as above to get all the combined capabilities, or they can
> pick-and-choose:
> 
> // Get the capabilities for the second active video device
> var caps = localStream.videoDevices.getCapabilities(1);
> 
> The capabilities returned are the set of capability "ranges" that the device supports, 
> suitable for use in the constraints API.
> 
> 3. Apply constraints to Track objects
> 
> Now that the capabilities can be inspected for approved devices (all or per-track),
> constraints can be applied to them. As with getting the capabilities, constraint 
> application is applied either directly to a MediaStreamTrack or to all currently 
> active local tracks in a given list (audio/video).

Or IIUC, constraint application is applied either directly to a MediaStreamTrack or to all currently active local _device-originated_ tracks in a given list (audio/video).

> 
> // New interface
> interface LocalMediaStreamTrackList {
>   readonly attribute unsigned long length;
>   getter MediaStreamTrack (unsigned long index);
>   attribute EventHandler onaddtrack;
>   attribute EventHandler onremovetrack;
>   LocalDeviceCapabilites getCapabilities(optional (MediaStreamTrack or unsigned long) track);
>   // +++ Constraints API
>   void applySettings(MediaTrackConstraints, optional (MediaStreamTrack or unsigned long) track);
> };
> 
> I'm not a believer in track object mutation. In other words, I don't believe that
> a track should be able to be "in-place" updated by applying a constraint to it. 

I think I don't share this view :( Given a stream, that we pipe data through, it seems reasonable for that pipe to change its characteristics on the fly.

What I think is happening here is that we're using onaddtrack/onremovetrack to signal that settings have been applied. Perhaps that would be better handled with a callback argument in applySettings or via an event listener on each individual LocalMediaStreamTrack.
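
Roughly what I have in mind, as a sketch only: the FakeLocalVideoTrack class, the per-track applySettings signature with a callback, and the "settingschange" event name are all invented for illustration and are not part of the proposal:

```javascript
// Hypothetical mock of a track that is updated in place.
class FakeLocalVideoTrack extends EventTarget {
  constructor(settings) {
    super();
    this.settings = { ...settings };
  }
  applySettings(newSettings, onApplied) {
    // This mock applies the change immediately; a real device could take
    // time, so callers should rely on the notification, not on timing.
    Object.assign(this.settings, newSettings);
    this.dispatchEvent(new Event("settingschange"));
    if (typeof onApplied === "function") onApplied(this.settings);
  }
}

const track = new FakeLocalVideoTrack({ zoom: 1 });
const log = [];
track.addEventListener("settingschange", () => log.push("event"));
track.applySettings({ zoom: 2 }, (s) => log.push("callback zoom=" + s.zoom));
console.log(log); // the event entry first, then the callback entry
```

The track object keeps its identity throughout, so existing references held by the page stay valid after the change.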

> Rather, my view is that if there is a change that the device makes in response
> to the application of new constraints, old tracks are stopped and new track objects
> are created in response. This makes the application of constraints implicitly 
> asynchronous, which prevents developers from taking a dependency on the "immediate
> application" of constraints to a given track which may not be possible for all devices.
> 
> The "applySettings" API (I renamed it to sound more friendly), acts on all the local 
> media tracks in the list by default, or targets only a specific track if one is 
> indicated in the 2nd parameter.

Again, what would happen if settings conflict between two devices? It seems better if settings can be queried and applied only per track. Applying settings to multiple tracks via a single call feels like it could be an optimisation rather than a strictly necessary addition.
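
As a sketch of the per-track approach (the device objects, their zoom ranges, and the applyZoom helper are assumptions of mine for illustration only):

```javascript
// Hypothetical per-track shape: each device carries its own zoom range.
const devices = [
  { label: "camera A", zoom: { min: 1, max: 4 }, currentZoom: 1 },
  { label: "camera B", zoom: { min: 2, max: 10 }, currentZoom: 2 },
];

// Apply a requested zoom to one track only if that track supports it.
function applyZoom(device, value) {
  const ok = value >= device.zoom.min && value <= device.zoom.max;
  if (ok) device.currentZoom = value;
  return ok;
}

// The "apply to all" form is then just a loop over the per-track call,
// with a per-track result instead of one combined outcome.
const results = devices.map((d) => ({ label: d.label, applied: applyZoom(d, 8) }));
console.log(results);
```

Here a zoom of 8 is rejected by camera A and accepted by camera B, and the page can see which is which because each track reports its own result.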

> 
> If targeted toward a specific track, the new constraints are evaluated against 
> the current track settings, and if the current settings fall within the new 
> constraints then no change is made. If the constraints affect the current settings 
> of the track, then that track is stopped (and consequently removed from the
> LocalMediaStreamTrackList as a result) and if the device associated with that track
> can support the new constraints, then a new track is created and added to the
> LocalMediaStreamTrackList. (All these actions trigger the appropriate add/remove 
> events.)
> 
> If not targeted toward a specific track (applying to all the current active devices),
> the above algorithm is run for every track in the list.
> 
> To summarize:
> This proposal adds two new track lists to local media streams which correspond to
> the "active" local devices supplying the tracks which have been approved by
> getUserMedia. The proposal incorporates a getCapabilities and applySettings API onto
> these tracks lists which can operate on the list as a whole or on individual tracks
> in the list. This proposal also recommends that individual local tracks be augmented
> with device-specific "current" settings, though the details of these settings were 
> only alluded to via another proposal [2].

I believe this proposal is heading in the right direction. There are naturally still a few dark corners we need to explore but, in general, I like it and strongly support the direction this API is moving in. I'd like to explore what settings we intend to allow via the applySettings call and discuss media stream track mutability further.

It feels like something Opera could actively support and I would be happy to contribute time to put this into draft spec form with you.

Thanks,

Rich

> 
> [1] http://lists.w3.org/Archives/Public/public-media-capture/2012Jul/0069.html
> [2] http://lists.w3.org/Archives/Public/public-media-capture/2012Aug/0032.html
> 

Received on Wednesday, 22 August 2012 06:09:12 UTC