Re: V4 comments (Re: Settings API Proposal v3 feedback summary) from Harald Alvestrand on 2012-10-06 (public-media-capture@w3.org from October 2012)

From: Harald Alvestrand <harald@alvestrand.no>
Date: Sat, 06 Oct 2012 13:09:46 +0200
To: Travis Leithead <travis.leithead@microsoft.com>
CC: "public-media-capture@w3.org" <public-media-capture@w3.org>
Message-ID: <5070117A.1040204@alvestrand.no>
On 10/05/2012 08:04 PM, Travis Leithead wrote:
>> From: Harald Alvestrand [mailto:harald@alvestrand.no]
>>
>> On 10/03/2012 03:10 AM, Travis Leithead wrote:
>>> V4 is now ready. I've posted it on Mercurial for easier reading:
>>>
>>> http://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/proposals/SettingsAPI_proposal_v4.html
>>> Thanks! Looking forward to your continued feedback.
>>>
>> Thanks Travis!
>>
>> I like the way this is moving - in particular, I like the idea of
>> reducing the number of places where apps have to care what kind of
>> stream or track they're handling.
>>
>> Some worries persist, however:
>>
>> - I really like the idea in section 3 where you model the request
>> interface as "construct a constraints object and apply it". This unifies
>> the constraints namespace and the namespace for requested attributes
>> quite neatly.
>> HOWEVER - the namespaces are a bit non-overlapping, in particular the
>> control of height and width, where you introduce a new enumerated
>> "dimension" namespace. When items in both namespaces collide, do we
>> synthesize a union, or do we declare a winner?
> My concept of constraints is non-persistent, which is where I think this
> confusion is coming from. You seem to have the view that constraints persist,
> and that whatever changes the application wants to make after an initial
> set of constraints has been established must be layered on top of those
> initial constraints. Hence there's the notion of collisions, unions, etc.,
> and the need to deal with the complexity that introduces.
>
> In my view (the way I've structured the proposal), constraints are
> non-persistent, and behave conceptually similar to a "picker" or "search"
> API. The result of applying constraints produces a single device that has been
> suitably configured to match the requested constraints. (Note that getUserMedia
> actually allows for at most two [different] devices to be selected, but this
> can be thought of as two separate applications of constraints to each class
> of device.) There is no "memory" associated with the previous constraints
> after they have been applied. Having selected and configured a device, the
> previous constraints are forgotten.
>
> Thus when you start a set of "request()" changes to change the settings of
> a particular device, this is building up a brand-new set of constraints to
> apply to the device (which could contradict a previous set of constraints
> that were applied). If it didn't work this way, then it would be very
> challenging to configure a setting to the minimum, then later to the maximum
> and then back to the minimum again, as a particular scenario (or end-user) might
> want to do.
I see where we differ now .... since I'm assuming from the get-go that 
the UA will need to dynamically adapt to the conditions of CPUs, 
resources and network bandwidth available, I've just assumed that the 
min/max constraints applied at creation time establish a space within 
which the UA is not only able to pick a current value, it is able to 
change to a differing operation point within that range without 
requiring any interaction with the JavaScript application.

This kind of adaptation happens today, and not only in network related 
matters; for instance, the Mac iSight cameras are famous for dropping 
their framerate in low light conditions - trading picture quality for 
framerate.

That's also the background for my continued preference for "the range of 
acceptable operation points" rather than "this specific mode" as what 
gets passed from the user's side.

Good to have this explicitly framed!
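
To make the "range of acceptable operation points" idea concrete, here is a minimal sketch. It is not part of either proposal: the `Range` type and the `pickOperatingPoint` function are hypothetical names, and the frame-rate numbers are invented for illustration. The point is only that the application states its range once, and the UA can then move inside that range without calling back into script.

```typescript
// Hypothetical type for illustration; not taken from the proposal text.
interface Range {
  min: number;
  max: number;
}

// Clamp the value the UA would currently like to use into the range
// the application declared acceptable.
function pickOperatingPoint(accepted: Range, preferred: number): number {
  if (accepted.min > accepted.max) {
    throw new RangeError("empty range: min exceeds max");
  }
  return Math.min(accepted.max, Math.max(accepted.min, preferred));
}

// The application says once which frame rates it can live with.
const frameRate: Range = { min: 10, max: 30 };

// Good light: the UA runs at the top of the range.
const normal = pickOperatingPoint(frameRate, 30);

// Low light: the camera prefers 15 fps, which is still inside the range,
// so no interaction with the application is needed.
const lowLight = pickOperatingPoint(frameRate, 15);

// Overheating: the UA would like 5 fps, but the application's floor is 10.
const hot = pickOperatingPoint(frameRate, 5);

console.log(normal, lowLight, hot);
```

Under a "specific value" model, by contrast, each of these three changes would need either a new request from the application or a violation of what the application asked for.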

>> - It's not clear to me what happens to existing constraints placed on a
>> track after a series of "request" calls and subsequent applications; are
>> the constraint lists merged into an intersection? are they replaced? or
>> is this device and implementation dependent?
>> I'm particularly thinking of usage patterns where an implementation uses
>> constraints at the getUserMedia phase to get a suitable device, and then
>> uses the request interface to manipulate it afterwards; is it allowed to
>> set constraints outside the originally specified ones, or isn't it?
>> Not clear what the right answer is, but there should be only one.
> I hope my previous explanation made the answer to the above questions clear.
>
>
>> - I've worried about this before and gotten some pushback, but I'm still
>> worried about the wisdom of setting specific values rather than
>> "acceptable ranges" - for instance, some systems will drop resolution,
>> framerate or encoding complexity based on CPU temperature, to avoid
>> systems shutting down mid-call under "hot" conditions. Setting specific
>> values will seem to take away that freedom; setting a range of "values I
>> can live with" would preserve it.
> Interesting. My proposal doesn't account for the user agent dynamically
> changing the min/max ranges of the settings. In practice, I don't know how
> important this case really is. For example, if the device becomes too hot,
> then as an implementer, I'd just turn off the device to let it cool down!
And cut the videoconference in mid-sentence, rather than continuing at a 
lower quality?
We would not get good user feedback from that.
>
> I'm pretty confident that setting specific values is desirable, given the
> settings tell the developer exactly what the acceptable value ranges are.
Nit: In many cases, there is no source that can be queried for the list 
of acceptable settings; certain camera/driver combinations will happily 
accept any setting you request, and just scale the stuff coming off the 
hardware to fit, without informing you that this is going on. It's a 
strange world down there....
> I also envision a use case where these settings are likely to be hooked up
> directly to the application's UI, in which case a select control or input
> range control will represent the various settings, and the application just
> passes those values directly into the request() API to apply the changes.
>
>
>> - I wonder if experimentation and flexibility could be served if we
>> could generalize the interface, so that we don't have to rev the API
>> every time we add a setting - the long list of settings in
>> PictureAndVideoSettings could possibly be expressed as
>>
>> interface MediaSetting {
>>       DOMString name;
>>       (MediaSettingsList or MediaSettingsRange) setting;
>> }
>>
>> interface PictureAndVideoSettings {
>>      MediaSetting[] settings;
>> }
>>
>> I don't know if it is wise, but I'd like to hear others' thoughts on
>> such an API.
> I definitely thought about this. However, there are many more drawbacks to
> this approach (in my opinion) than otherwise:
>
> 1. The settings list makes feature detection much more difficult.
How?
> 2. The settings list moves the actual settings two levels deeper off of the
>     object. (Harder to use.)
> 3. The settings list does not solve name collisions (it's still a problem that
>     would need to be dealt with).
In constraints, a registry is proposed as the solution to collisions. 
Wouldn't that work here too?
> 4. The settings list does not make it easier/more flexible to rev the API
>     (partial interfaces in WebIDL allow simple extensions for v2 specs)
I'm not so worried about future revs of the API, I'm worried about 
future revs of the implementations.
New APIs take 2 years to complete; new implementations take six weeks.
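
On the feature-detection question, here is a rough sketch of what detection could look like against a named settings list. It mirrors the MediaSetting shape quoted above in TypeScript; the `findSetting` helper and the setting names ("zoom", "whiteBalance", "torch") are hypothetical and only there for illustration.

```typescript
// Hypothetical TypeScript mirror of the MediaSetting sketch quoted above.
interface MediaSettingsRange {
  min: number;
  max: number;
}
type MediaSettingsList = string[];

interface MediaSetting {
  name: string;
  setting: MediaSettingsList | MediaSettingsRange;
}

interface PictureAndVideoSettings {
  settings: MediaSetting[];
}

// Feature detection by name: undefined means this implementation
// does not expose the setting at all.
function findSetting(
  source: PictureAndVideoSettings,
  name: string
): MediaSetting | undefined {
  for (const candidate of source.settings) {
    if (candidate.name === name) {
      return candidate;
    }
  }
  return undefined;
}

// Invented example data; the setting names are assumptions.
const camera: PictureAndVideoSettings = {
  settings: [
    { name: "zoom", setting: { min: 1, max: 4 } },
    { name: "whiteBalance", setting: ["auto", "daylight"] },
  ],
};

const hasZoom = findSetting(camera, "zoom") !== undefined;
const hasTorch = findSetting(camera, "torch") !== undefined;

console.log(hasZoom, hasTorch);
```

With one attribute per setting, the equivalent check is a property test on the object. With the list, it is a lookup by name, which is one step more but lets an implementation ship a new setting without a new attribute in the IDL.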
>
>
>> - lastly, I think that in some cases, for some constraints, it makes
>> sense to offer a media settings interface on *any* media stream track,
>> including remote ones. I wonder if we can generalize it to be an
>> interface on MediaStream itself without hurting it?
> I agree that this hypothetical constraint could exist. If we wanted to add it
> we could simply define it in terms of the proposed interface like so:
>
> partial interface MediaStreamTrack {
>     readonly attribute MediaSettingsRange? hypotheticalSetting;
> };
>
> However, I think we've gotten to the point where we need to stop talking about
> hypotheticals, and talk about actual settings and actual constraints--something that
> implementers can start building with confidence. To make progress along those lines
> my proposal puts a stake in the ground about what settings and constraints could be
> relevant for device capture in v1. We should have a conversation about what track types
> might be necessary for RTCPeerConnection, and what settings/constraints could be
> exposed on those objects (though it's somewhat orthogonal to completing the Media Capture
> and Streams spec IMO).
OK, I'll state the specific requirements that we (Google Chrome Media) 
are being fed:

Horizontal size, vertical size, aspect ratio, frame rate and (max / 
target) bitrate.

The last one is a constraint that can only make sense in the context of 
a PeerConnection.
There are several scenarios where the first four make sense to request 
from either an incoming or an outgoing MediaStream, where the decision 
on how it is implemented (whether by changing the camera configuration 
or by inserting a scaling function in the pipeline) should be taken by 
the UA depending on the configuration-of-the-moment.

For instance, a camera may feed a local preview, a LAN-carried 
HD-quality feed into a TV studio, and a 3G-carried feed into a 
mobile-phone-based monitor. It may make sense for the UA to choose to use a 
rescaler in this case for the mobile-phone-based monitor.

The application has to be able to express those desires. The UA needs to 
figure out the best ways to satisfy them.
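
As a sketch of that split of responsibilities for the three-consumer example: the application states one desired size per consumer, and the UA decides where to capture and where to rescale. Everything here is an assumption made for illustration: the type names, the `planPipeline` function, the sizes, and the policy of capturing at the largest requested size. A real UA might choose differently depending on the configuration of the moment.

```typescript
// Hypothetical types; none of these names come from the proposal.
interface Size {
  width: number;
  height: number;
}

interface ConsumerRequest {
  consumer: string;
  size: Size;
}

interface ConsumerPlan {
  consumer: string;
  size: Size;
  needsRescale: boolean;
}

function area(s: Size): number {
  return s.width * s.height;
}

// One possible UA policy: configure the camera for the largest requested
// size and insert a rescaler for every consumer that asked for something else.
function planPipeline(
  requests: ConsumerRequest[]
): { capture: Size; plans: ConsumerPlan[] } {
  let largest: Size | null = null;
  for (const r of requests) {
    if (largest === null || area(r.size) > area(largest)) {
      largest = r.size;
    }
  }
  if (largest === null) {
    throw new Error("no consumers attached to the camera");
  }
  const capture: Size = largest;
  const plans = requests.map((r) => ({
    consumer: r.consumer,
    size: r.size,
    needsRescale:
      r.size.width !== capture.width || r.size.height !== capture.height,
  }));
  return { capture, plans };
}

// The application expresses its desires; the sizes are invented.
const result = planPipeline([
  { consumer: "local preview", size: { width: 640, height: 360 } },
  { consumer: "TV studio (LAN)", size: { width: 1920, height: 1080 } },
  { consumer: "mobile monitor (3G)", size: { width: 320, height: 180 } },
]);

for (const p of result.plans) {
  console.log(p.consumer, p.needsRescale ? "rescaled" : "native");
}
```

The application never says "reconfigure the camera" or "insert a scaler"; it only says what each consumer wants.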


>
Received on Saturday, 6 October 2012 11:10:18 UTC