RE: Comments on scenarios doc - 9 Feb Version from Travis Leithead on 2012-02-10 (public-media-capture@w3.org from February 2012)

From: Travis Leithead <travis.leithead@microsoft.com>
Date: Fri, 10 Feb 2012 20:38:03 +0000
To: "Frederick.Hirsch@nokia.com" <Frederick.Hirsch@nokia.com>
CC: "harald@alvestrand.no" <harald@alvestrand.no>, "public-media-capture@w3.org" <public-media-capture@w3.org>
Message-ID: <9768D477C67135458BF978A45BCF9B3838213B7D@TK5EX14MBXW603.wingroup.windeploy.ntde>
>-----Original Message-----
>From: Frederick.Hirsch@nokia.com [mailto:Frederick.Hirsch@nokia.com]
>
>Some additional suggestions
>
>(1) I suggest we move section 5.10 capturing media stream to be a new
>subsection in section 2, as it is a major use case.

Section 5.10 is just general commentary on media stream capture. The points I make in that section are actually incorporated into all the scenarios of section 2 already. For example, the four bullets in that section actually map to scenarios 2.2, 2.1, 2.1, and 2.5 respectively. Since section 2 is all about the complete end-to-end scenario, having this single bit about capture doesn't make sense to me as suggested.

Is there something specific about section 5.10 that you would like to emphasize as a variation?



>(2) section 2.1, hat scenario
>
>Add (after list 1-7)
>
>"Note that the permissions Amy has given are not persistent , so every
>subsequent use of the webcam or microphone will require Amy granting
>permission.
>This is to avoid the possibility of unauthorized camera or microphone use
>after Amy finishes sharing her hat information."
>
>(This is in contrast to 2.2 Election podcast which assumes persistent
>permission).

This sounds too much like a requirement statement rather than part of the scenario, in my view. Rather than add the text as you requested, I added a parenthetical to reinforce the idea (which I agree with) inline in the scenario. I also specified this idea in the first bullet.



>(3) section 2.2 election podcast
>
>are the persistent permissions based on time, or just general permissions?
>Is this permission aspect in scope for the API work here?
>
>Additional item  "Schedule by time period video and audio capture
>permissions."

I'm intentionally not clarifying how the permissions are managed, as I'm a strong believer that different UAs will want the freedom of managing the experience differently. What I did want to convey via the scenario is that there is such a concept as persisted permissions.

I believe it is definitely in scope of this TF to think through the permissions model and how it relates to getUserMedia. Based on these scenarios, I think that, generally speaking, there are cases when you want to establish least-privilege trust, cases where you want full trust, and probably many gradients in between.

This is directly pertinent to the ongoing issue regarding capabilities vs. hints in the API.

What I don't want to see happen in the spec are requirements regarding permissions that tie the hands of implementations with regard to how they want to do permission management. I would like to see some UA guidelines and I would like to ensure the API is designed with those guidelines in mind such that the right amount of privacy can be maintained under these various scenarios.



>(4) section 2.4 video diary
>
>Add
>7. Integration of video audio capture with battery status to enable save
>and termination.

Added.


>On 02/09/2012 12:00 PM, Randell Jesup wrote:
>> On 2/9/2012 10:10 AM, Frederick.Hirsch@nokia.com wrote:
>>> (5) 2.4.1.2 picture-in-picture
>>>
>>> Is this really picture-in-picture, or capture of multiple time-sync'd 
>>> videos that can subsequently be edited? Sounds like the latter.
>>
>> I think Travis added this after I commented on the list; in any case I 
>> was driving towards synchronized capture from both cameras, not 
>> locally composited before saving. You could make an argument that 
>> local compositing and recording multiple streams are equivalent from 
>> this view of the spec (in terms of requirements), but I think they're 
>> different in how users understand them.  If the requirements work out 
>> the same, I'm ok with it.
>The reason I suggested picture-in-picture as a variant was that having 
>both exposed "record multiple streams" and "manipulate streams and 
>record the result" as two separate requirements.

Since we already have 2.4.1.1 "simultaneous recording from multiple webcams" (which I think covers the ability to start two or more isolated recorders at the same time on supporting hardware), my view of 2.4.1.2, now renamed "capture a composed video", is that it is about making one capture of a video that is composed of two webcams' sources.

There are no details here as to how this is done (for example, it might be bit-blitting the preview of each webcam onto a Canvas by way of post-processing, and then recording the canvas element; it might be done by a media stream compositing API). I think we should keep an open mind. I also think we should spec the minimum functionality required to meet the scenarios even if it means leaning on other web platform features to complete the scenario.
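
To make the Canvas option a little more concrete, here is a rough sketch of the per-frame blit step only. It is one possible arrangement, not a proposal for the spec. The names `Rect`, `DrawTarget`, `insetRect` and `composeFrame` are hypothetical, the 25% inset size and 8-pixel margin are arbitrary assumptions, and `DrawTarget` is just a minimal stand-in shaped after the five-argument form of `CanvasRenderingContext2D.drawImage`. How the resulting canvas gets recorded is deliberately left out.

```typescript
// Sketch only: every name here is hypothetical and not taken from the
// scenarios document or from any specification.

interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Minimal stand-in shaped after the five-argument form of
// CanvasRenderingContext2D.drawImage(image, dx, dy, dWidth, dHeight).
interface DrawTarget<S> {
  drawImage(source: S, dx: number, dy: number, dWidth: number, dHeight: number): void;
}

// Compute a bottom-right inset rectangle, scaled relative to the canvas.
// The default scale and margin are arbitrary assumptions.
function insetRect(
  canvasWidth: number,
  canvasHeight: number,
  scale: number = 0.25,
  margin: number = 8
): Rect {
  const width = Math.round(canvasWidth * scale);
  const height = Math.round(canvasHeight * scale);
  return {
    x: canvasWidth - width - margin,
    y: canvasHeight - height - margin,
    width,
    height,
  };
}

// Blit one composed frame: the main source fills the canvas, then the
// second source is drawn on top of it as a small inset.
function composeFrame<S>(
  target: DrawTarget<S>,
  canvasWidth: number,
  canvasHeight: number,
  main: S,
  inset: S
): void {
  target.drawImage(main, 0, 0, canvasWidth, canvasHeight);
  const r = insetRect(canvasWidth, canvasHeight);
  target.drawImage(inset, r.x, r.y, r.width, r.height);
}
```

In a page, something like this would presumably be called once per preview frame with a 2D canvas context and the two webcam preview elements. Whether the composition happens this way or inside a compositing API should not matter to the scenario.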



>(6) Add to 5.1.1 privacy
>
>Add
>
>"In addition, care must be taken that webcam and audio devices are not able
>to record and stream data without the user's knowledge.
>Explicit permission should be granted for a specific activity of a limited
>duration.  Configuration controls should be possible to enable age-limits
>on webcam use."
>
>I'm not sure how to address concerns about age and granting permissions, I
>believe there have been some "child exploitation center" concerns noted in
>the UK regarding
>children being mislead into using video inappropriate. Possible issue.

Thanks. I added that.



>(7) 5.1.2 privacy Issues
>
>add
>
>4.  Enabling control configuration of webcam based on age?
>
>5. Phishing and other attacks using webcam, audio (possible issue to note)

Great considerations. I added them.


>(8) section 5.4 stopping
>
>rather than muting, isn't the alternative to "pause" the capture?

Pausing works better. I made the edit.


>(9) 5.5 pre-processing
>
>support of something like "zakim, mute george" seems very valuable.
>
>gain control might be tricky if it enhances unwanted background noise on a
>specific un-muted input.
>
>Sounds like there is enough work in v1 to suggest not including video pre-
>processing, as is suggested in issue 1.

I agree. End-pointing/level detection would allow the "zakim" scenario to work. No spec change made.
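
For illustration only, level detection could be as simple as an RMS check over one frame of audio samples. This is a sketch under assumptions: the names `rmsLevel` and `isAboveLevel` are hypothetical, samples are assumed to be PCM values in the range [-1, 1], and the 0.05 threshold is arbitrary. It does not describe how any UA or spec would actually expose this.

```typescript
// Sketch only: hypothetical helper names, assumed sample range [-1, 1].

// Root-mean-square level of one frame of audio samples.
function rmsLevel(samples: Float32Array): number {
  if (samples.length === 0) {
    return 0;
  }
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// True when the frame's level reaches the threshold. The default
// threshold is an arbitrary assumption, not a recommended value.
function isAboveLevel(samples: Float32Array, threshold: number = 0.05): boolean {
  return rmsLevel(samples) >= threshold;
}
```

An application could use a check like this to find which input is producing sound and mute it, without needing gain control.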



>(10) 5.7.1 privacy
>
>Does  [[A selected device should provide some state information that
>identifies itself as "selected"]] mean that I could write a monitor app to
>determine when the
>webcam is in use and by which apps? (Or should my request for a device
>simply fail without reason if in use?
>this goes back to the earlier issue of multiple web pages wanting
>simultaneous access and how to handle it,

Upon re-reading this paragraph, I couldn't tease out the point I was trying to make. Since it is leading to confusion, I just removed it.



>(11) Seems that issue of multiple web apps desiring simultaneous access, or
>interrupting access to a device, warrants separate section and detailed
>discussion.

I'd love to hash this out more. Right now, the topic is located mostly in section 5.2. I think that as the TF works on the hints/permissions issue, this may come up again and perhaps have an elegant solution.
Received on Friday, 10 February 2012 20:38:48 UTC