Re: Constraints structure and Capabilities API from Rich Tibbett on 2012-02-24 (public-webrtc@w3.org from February 2012)

From: Rich Tibbett <richt@opera.com>
Date: Fri, 24 Feb 2012 17:53:31 +0100
To: Randell Jesup <randell-ietf@jesup.org>
CC: public-webrtc@w3.org
Message-ID: <4F47C08B.50109@opera.com>
Randell Jesup wrote:
> On 2/24/2012 7:23 AM, Rich Tibbett wrote:
>>
>> Media that is going to be sent over a p2p connection and data that is
>> simply intended for local playback (e.g. as the back for an AR app),
>> local recording (e.g. for conference/dating/social network
>> introductions and pre-recorded messages) or local manipulation (e.g.
>> barcode scanning, face recognition) inherently have very different
>> properties.
>
> Agreed (not 100% sure about 'very', but still, agreed).

Essentially, a lot of the min/max capability configuration under
discussion only really makes sense for downstream APIs such as media
recording and p2p, and should probably be decoupled from gUM.

>
>>
>> I think focus on the p2p use case has been to the detriment of
>> consideration of local use cases. In all three of the local cases
>> above that do not require peer-to-peer streaming it would be ideal
>> simply to have the highest quality video and audio that can be
>> provided by the UA returned for local usage.
>
> I'll just note that "highest quality" is a very fluid thing in video. Is
> it highest framerate? Highest resolution? How does light level affect
> it? Noise level (related to light)?

It means native framerate and native resolution. Light/noise balance
should be auto-configured by the implementation. Developers shouldn't
(and I'll suggest won't) go to this level of configuration the majority
of the time.

> Is consistent framerate important?

We could discuss this further but it may make sense to have some 
consistency in this regard.

> What happens when a camera has a built-in encoder (and some do, now)?

Codecs and encoding have little actual value if the stream is simply a
pipe to a local video element. They become significant only for
downstream APIs, and the hooks for developers to select characteristics
for the encoding should be applied at that level, if at all, since e.g.
UDP negotiation gets us a long way towards knowing what we actually need
rather than what we want.

> What if the app wants to process the data (image recognition, etc), but
> to reduce processing load (or reduce low-light noise) wants a lower
> resolution or lower framerate or both?

This was one of the points in my previous email. I think there is a case
for limiting the sheer amount of pixel data you get back in some
situations. Specifically, I'm thinking about real-time analysis use
cases, but I see that as one flag hint rather than a set of min/max
properties.
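
To illustrate the "one flag hint" idea, here is a minimal sketch. It is
not a proposed API: `CameraMode`, `CaptureHint`, `reducedData`,
`pixelRate` and `chooseMode` are all hypothetical names, and using
pixels per second as the measure of "amount of pixel data" is an
assumption. A UA would be free to interpret such a hint differently.

```typescript
// Hypothetical shape of one capture mode a camera might expose.
interface CameraMode {
  width: number;
  height: number;
  frameRate: number;
}

// Hypothetical single hint, in place of a set of min/max properties.
interface CaptureHint {
  reducedData: boolean;
}

// Pixels delivered per second for a mode; a rough proxy for processing load.
function pixelRate(mode: CameraMode): number {
  return mode.width * mode.height * mode.frameRate;
}

// With the hint set, pick the mode that delivers the least pixel data;
// otherwise pick the mode that delivers the most.
function chooseMode(
  modes: CameraMode[],
  hint: CaptureHint
): CameraMode | undefined {
  let chosen: CameraMode | undefined;
  for (const mode of modes) {
    if (chosen === undefined) {
      chosen = mode;
      continue;
    }
    const better = hint.reducedData
      ? pixelRate(mode) < pixelRate(chosen)
      : pixelRate(mode) > pixelRate(chosen);
    if (better) {
      chosen = mode;
    }
  }
  return chosen;
}
```

The point of the sketch is that the page states an intent (less data
for real-time analysis) and the implementation maps it onto whatever
the hardware offers, instead of the page enumerating min/max values.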

>
> So just be very careful about assuming that 'highest quality' is what
> you want for those local cases, and that it has any sort of fixed
> definition.

Highest quality means the max values that the native cam/OS/UA supports.
It could be made to mean a fixed profile for framerate / resolution if
it turns out it is more useful to have such a profile (e.g. developer
expectation that they should always receive and process the same number
of frames, pixels, etc.).
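
As a rough sketch of "the max values that the native cam/OS/UA
supports", the selection could look like the following. `CaptureMode`
and `pickHighestQuality` are hypothetical names, and ranking by
resolution first with framerate as the tie-breaker is an assumption made
only for illustration; a real implementation could rank differently.

```typescript
// Hypothetical description of one mode reported by the camera/OS.
interface CaptureMode {
  width: number;
  height: number;
  frameRate: number;
}

// Pick the mode with the most pixels per frame; among modes with the
// same pixel count, prefer the higher frame rate.
function pickHighestQuality(modes: CaptureMode[]): CaptureMode | undefined {
  let best: CaptureMode | undefined;
  for (const mode of modes) {
    if (best === undefined) {
      best = mode;
      continue;
    }
    const pixels = mode.width * mode.height;
    const bestPixels = best.width * best.height;
    if (
      pixels > bestPixels ||
      (pixels === bestPixels && mode.frameRate > best.frameRate)
    ) {
      best = mode;
    }
  }
  return best;
}
```

A fixed profile, if that turned out to be more useful, would replace
the search with a constant mode that every UA is expected to deliver.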

If we do decide to go with a large set of capabilities and constraints
that can be configured via JavaScript, it would still be good for the
majority of users, who won't configure these properties, to be able to
expect something of high quality that is likely to cover whatever use
cases they have in mind for client-side development, with those streams
being adapted at the point they are hooked up to a downstream API such
as PeerConnection.
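
A minimal sketch of adapting at the point of hook-up, assuming the
downstream API knows its own limits: `StreamSettings`,
`DownstreamLimits` and `adaptForDownstream` are hypothetical names, not
part of gUM or PeerConnection. The native stream is scaled down to fit,
keeping its aspect ratio, and is never scaled up.

```typescript
// Hypothetical settings of the stream as captured locally.
interface StreamSettings {
  width: number;
  height: number;
  frameRate: number;
}

// Hypothetical limits a downstream consumer has settled on.
interface DownstreamLimits {
  maxWidth: number;
  maxHeight: number;
  maxFrameRate: number;
}

// Scale the native settings down to fit the limits, preserving the
// aspect ratio. A scale factor capped at 1 means no upscaling.
function adaptForDownstream(
  native: StreamSettings,
  limits: DownstreamLimits
): StreamSettings {
  const scale = Math.min(
    1,
    limits.maxWidth / native.width,
    limits.maxHeight / native.height
  );
  return {
    width: Math.round(native.width * scale),
    height: Math.round(native.height * scale),
    frameRate: Math.min(native.frameRate, limits.maxFrameRate),
  };
}
```

Local consumers such as a video element would keep receiving the native
settings; only the copy handed to the downstream API is reduced.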

- Rich
Received on Friday, 24 February 2012 16:54:09 UTC