Re: Capabilities API proposal

On 1/20/2012 2:16 PM, Cullen Jennings wrote:
> First, on the topic of video resolution - I'm always scared of a fixed set of labels such as cga, vga, xvga and so on - the problem is that apps get coded with a set of labels that corresponds to what they had at the time they were coded, and then cannot take advantage of new higher resolutions as they come out. My proposal would be that instead we report max-width, max-height, and max-fps. The camera reports the max it supports in any mode, even if those maxima are not available together. For example, a camera that can do WVGA at 120 fps but 1080p at 30 fps would report its max-height as 1080 and its max-fps as 120, even though it may not be able to do both at the same time. Similarly, if several cameras were attached to the browser at the same time, it would report a single max that represented the max across all the cameras. This may sound very limiting, but it substantially reduces the complexity of the API, reduces fingerprinting privacy concerns, and still meets the use cases I heard about of trying to render reasonable user interfaces given the possible capabilities of the machine.
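
(To make sure I'm reading the proposal right: the reported capabilities
would be a single flat set of per-axis maxima, something like the
following?  Names and shape are mine, purely for illustration, not
anything specified.)

  // Hypothetical shape for the proposed capability report -- the names
  // are made up for illustration, not a proposed API surface.
  interface VideoCaptureCapabilities {
    maxWidth: number;   // max width supported in *any* mode, on any camera
    maxHeight: number;  // max height supported in any mode
    maxFps: number;     // max frame rate supported in any mode
  }

  // Per the example quoted above: a camera doing WVGA at 120 fps and
  // 1080p at 30 fps reports the per-axis maxima even though they are
  // not achievable simultaneously.
  const example: VideoCaptureCapabilities = {
    maxWidth: 1920,
    maxHeight: 1080,
    maxFps: 120,
  };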

I respectfully disagree, somewhat.  Reporting is one issue, but for 
selecting I want to be able to give priority to either resolution or 
frame rate, or if we think more control is needed, a target minimum 
frame rate.

Generally, my experience is that for person-to-person calls, frame rate 
(especially a *consistently high* frame rate) is more important than 
resolution.  I really, really want to see 25-30 fps, and a steady rate, 
not one that dips every time someone talks with their hands or adjusts 
their chair.  Now, different apps (and different users/use-cases) have 
different needs, so the main selectors I see are: minimum frame rate (a 
request, not an absolute limit), favor resolution over frame rate or 
vice versa, and maybe maximum resolution.
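
Something along these lines -- all names made up, just to illustrate 
the selectors, not a concrete API proposal:

  // Sketch only: the selection hints I have in mind.
  interface VideoSelectionHints {
    minFrameRate?: number;               // a request, NOT an absolute limit
    prefer?: "framerate" | "resolution"; // which axis to favor under pressure
    maxWidth?: number;                   // optional cap on resolution
    maxHeight?: number;
  }

  // e.g. a person-to-person call that cares most about a steady 25-30 fps:
  const callHints: VideoSelectionHints = {
    minFrameRate: 25,
    prefer: "framerate",
  };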

Note that separate control of (encoded) bandwidth is required, which 
feeds back through the MediaStream to change capture parameters as 
needed while capturing!  This is not necessarily part of the JS API per 
se, but if you hook a MediaStream up to an encoder/sink (like WebRTC), 
that sink needs to be able to adjust parameters of the capturing device.  
If that means the API needs to be defined here (and it may), then we need that.
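
Roughly the kind of feedback path I mean -- hypothetical names, only 
showing the direction of the control flow, not a real interface:

  // A sink (e.g. an encoder) asks the source to change capture
  // parameters while capturing is in progress.
  interface AdjustableVideoSource {
    requestCapture(params: {
      width?: number;
      height?: number;
      frameRate?: number;
    }): void;
  }

  // An encoder under bandwidth pressure might do something like:
  function onBandwidthDrop(source: AdjustableVideoSource) {
    // Drop capture resolution/frame rate rather than encoding frames
    // that will just be thrown away downstream.
    source.requestCapture({ width: 640, height: 360, frameRate: 20 });
  }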

>
> For audio: I like the music and speech. I'm far more dubious about fm, studio, etc. I will note that I cannot imagine any browser that does not have the capability for both music and speech if it does audio at all, so I sort of wonder about the value of this as a capability. I do see it needed as a hint.

I agree.

> For video: I like action for temporal optimization, and I think we could call the spatial resolution "detail" or something like that. Again, not clear that this is a capability - this is more of a hint.
>
> I don't really get what face2face would be, but I think it is worth being able to indicate that something is interactive media vs streaming media. I'd expect the browser to enable echo cancellation and such for interactive.
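
(Just to pin down the vocabulary under discussion as I read it -- 
illustrative shape only, not proposed syntax:)

  // The hint categories mentioned above, written out.
  type AudioContentHint = "speech" | "music";
  type VideoContentHint = "action" | "detail";        // temporal vs. spatial optimization
  type MediaUsageHint = "interactive" | "streaming";  // interactive => echo cancellation etc.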

Where is this API/description being used?  If this is for 
GetUserMedia(), then the things you mention are higher-level constructs.

Another issue here is "can there be multiple audio or video tracks" in a 
MediaStream?
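
Concretely, the question is whether something like this is allowed 
(types invented purely to frame the question):

  interface Track { kind: "audio" | "video"; label: string; }
  interface Stream { tracks: Track[]; }

  const multiTrack: Stream = {
    tracks: [
      { kind: "audio", label: "microphone" },
      { kind: "video", label: "front camera" },
      { kind: "video", label: "document camera" },  // a second video track -- allowed or not?
    ],
  };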

>
> The whole bandwidth thing I am confused on. Assuming we even had any idea what "broadband" bandwidth was, I sort of doubt that the browser would be able to reliably figure out whether it had it before it was sending media.

Bandwidth is more an issue (as a capability) of the power of the 
encoder and decoder.  It's hard to specify that without knowledge of the 
codec used (and even then, it's more an issue of maximum resolution than 
maximum bandwidth), so I'm quite unclear on the meaning or utility of this.



-- 
Randell Jesup
randell-ietf@jesup.org
