Media Capture Task Force teleconference

09 Feb 2012




Present: Daniel_Burnett, Doug_Schepers, Frederick_Hirsch, Harald_Alvestrand, Josh_Soref, Rich_Tibbett, Stefan_Hakansson, Travis_Leithead, anant, darobin, derf, dom, fluffy, jesup
Chair: Harald, Stefan


<scribe> scribe: Josh_Soref


StefanH: the proposed agenda
... Capabilities and Privacy
... and Scenarios

Capabilities and Privacy

StefanH: I've asked burn to introduce this topic

burn: so, what I'm going to do is give a brief summary of what we've talked about
... we may have to review some of the arguments we were given
... a number of people were talking about capabilities
... and also about fingerprinting
... what we did at the last WebRTC meeting
... was discuss capabilities
... I'll jump to the conclusion of that really long discussion
... there was a consensus that capabilities information was needed
... to build applications with WebRTC apis
... for those unfamiliar with this
... it may seem obvious that this was necessary
... but it wasn't
... we tried to decide initially whether we really needed it
... there are applications like Presence
... where you need to know the means through which you can contact people

<dom> Discussions on capabilities and fingerprinting at the WebRTC F2F last week

burn: for a Dating site, you may want to give preference to those with a WebCam available
... that's one example, there were others
... there was a conclusion that capabilities were needed for the apps people would like to build
... as part of this discussion, there was a discussion about the level of capabilities
... do you just want high level, or details
... we didn't really come to a clear consensus on the level
... the richer you want the application to be
... the more details you want about the client
... the details will depend on how much you trust the application
... most people generally trust an application they've locally installed
... we might want to consider how much capabilities to expose
... based on how much you've dealt with a site
... e.g. giving more details once you've logged in
... for a downloadable-installable-webapp
... essentially a shell installed on a machine
... in that case, you may not need to ask about using mic+camera
... even though if you were to visit the site live you would need to ask (same general app)
... I'm sure there will continue to be a discussion on this
... some people proposed...
... considering Google Hangout
... and trying to design to support that
... some wanted to do other things than that, for which they want more capability info
... we didn't reach a consensus on that
... that's probably a high level summary
... one other thing
... I didn't mention fingerprinting above
... but we did discuss it quite a bit
... it was brought up that there are already many bits available for a user w/o this
... sites may eventually be able to uniquely identify you
... they know you bought item X at a store
... because they happen to have sufficient information about you
... not because they're using linking cookies
... this is why it was brought up
... there are sites that you trust
... sites like Facebook may already know who you are
... for some sites, you may not be giving more information
... ok that's about it

<jesup> Too much information makes things like private browsing not private - can even be dangerous in some cases

burn: any questions?

fjh: Question about other privacy concerns
... use of the camera/audio
... without the user knowing it

burn: there has always been a requirement about notifying users of use
... this was just a discussion about when things are being used
... any other questions

anant: comments
... I think it makes sense to have levels about how much capabilities are actually exposed
... about media being transmitted
... browsers are not the only HTML clients
... while browsers should notify users that cameras are on
... it might not make sense for a smart phone to tell a user that
... you might want to write a dialer in HTML5 using getUserMedia
... but it doesn't make sense for the dialer to have to ask for permission or be limited in the data it gets about the hardware
... that's where defining two separate levels is useful
... then the useragent can decide based on the use case about what to disclose
... for a browser / random web sites, that's less
... and for a telephone, the UA could disclose more
... the spec can define both
... and the UA can decide which class to apply for the app
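
The two disclosure levels anant describes could be sketched roughly as follows. This is only an illustration: the level names ("untrusted", "trusted"), the capability fields, and the capabilitiesFor helper are all assumptions, not anything the group agreed on or that appears in a draft.

```javascript
// Hypothetical sketch: the level names and capability fields below are
// illustrative assumptions, not part of any draft discussed on the call.
const DEVICE_INFO = {
  hasCamera: true,
  hasMicrophone: true,
  cameras: [{ label: "front", maxWidth: 1280, maxHeight: 720 }],
};

// Return only the capability detail appropriate to the level the UA
// has assigned to the calling app.
function capabilitiesFor(level, info) {
  if (level === "untrusted") {
    // Coarse answer only, which exposes fewer bits for fingerprinting.
    return { hasCamera: info.hasCamera, hasMicrophone: info.hasMicrophone };
  }
  if (level === "trusted") {
    // Full detail, e.g. for a dialer app on a phone.
    return {
      hasCamera: info.hasCamera,
      hasMicrophone: info.hasMicrophone,
      cameras: info.cameras.map((camera) => ({ ...camera })),
    };
  }
  throw new Error("unknown level: " + level);
}

console.log(capabilitiesFor("untrusted", DEVICE_INFO));
console.log(capabilitiesFor("trusted", DEVICE_INFO));
```

In this sketch the UA, not the page, chooses the level, which matches the suggestion that the spec defines the classes and the UA decides which one applies.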

burn: there wasn't necessarily consensus on 2 levels
... there was interest on multiple levels
... maybe more than 2
... anant was interested in 2
... I think it might be a good approach to have multiple levels
... any other questions?

fluffy: Cullen here
... one question I had was about notifications about capabilities changing
... let's say you had permission to know about capabilities
... and someone plugged in a camera

burn: we didn't get consensus on that
... we didn't have time to discuss in detail how that would work
... how frequent, or how it would look

anant: some part of that will be tied to what levels of capabilities we have
... but when they change, it makes sense to notify at the same level
... I don't think there's a risk of notifying within the same level

burn: I agree

StefanH: I've heard a lot of people speaking from WebRTC
... I'd like to hear feedback from the DAP people

harald: focus on what parts are different

burn: what important things did we miss?

fjh: one question is how we establish this trust
... installing an app is one way

burn: we didn't get to that

anant: my proposal is to let the UA figure it out
... I think we should leave it up to the UA to decide

fluffy: I have some worry about this
... I understand that things will be different for certain environments
... we should say "if you're a browser, you should probably do this"
... it's not like the standard police will go after you

anant: right
... for Chrome and Firefox we have means to escalate privilege
... and IE
... I don't know if Travis has comments on that
... I think all three have a way to say that

[ Scribe explains that you need to introduce yourself ]

Travis: it was brought up about Windows Metro apps
... and capabilities / permissions
... while I think it's good to describe the options for UAs
... it might be useful in our spec to categorize these things
... one categorization is:
... is this Implicitly trusted
... is this Implicitly untrusted
... determining how frequently you want to ask / what mechanism, is too far

harald: you're not talking about Trust in the UA

Travis: I think that's what I'm addressing
... from a high level point of view
... if the UA is a Phone
... I think there's a point at which I grant permission

<dom> [DAP has had similar discussions many times; but unless we have a good story around trusted/untrusted, I think we should just focus on untrusted for the time being]

Travis: and that can be done when the hardware is built
... if the UA is a Browser, and it can be exposed to untrusted apps
... then it's a different category

fjh: I had a question about the earlier statement
... about the browser being able to know based on the url
... is there an example?

anant: it's not based on the URL
... for Chrome, you can go to the Chrome web store
... and install an application
... that's the mechanism of establishing trust

harald: are we finished addressing capabilities?

burn: one other question
... cases where trust might be improved over time
... we've covered distinctions between UAs that may or may not be trusted
... and apps that may or may not be trusted
... but there's this other level
... say I'm using a Web Browser
... and I'm using some online application
... that I did not install
... that doesn't mean that I might not want to have the ability to gradually improve the privilege of the app
... I may not have a problem with an app knowing that I have a camera/microphone

<dom> Installing web apps, Robin Berjon, Feb 7 2012

burn: when I was expressing concern about two levels
... it may be convenient for me to say "I trust this <app|site|domain>"
... I'm saying there may be "trusted-noninstalled-apps"
... it may be another category

anant: by category I'm talking about the information provided
... not about how the category is established
... installation need not be the only way to establish a category
... we need to provide another way
... I don't know immediately what those ways might look like

fluffy: do we want to talk at all about what capabilities could be discovered?

<jesup> Perhaps a request for added trust causes a geolocation-type doorhanger

fluffy: for the trusted ring, I don't think we need to hide anything
... then there's the question about how simple we want the JavaScript api to be
... we don't need to limit for fingerprinting
... overloading...

burn: I agree that we can hide more details in a tiered object
... for apps that we're interested in, we definitely want everything to be available

harald: an expandable set

Travis: I'm a little concerned about how that principle leads this door right open
... how do we specify *everything* about a device
... I come from the security camp
... I'd like to expose the least amount to get by

<dom> +1 to Travis

<fjh> +1 to least privilege

Travis: I'd like to iterate over time about what to expose

burn: if I have an app on my desktop

<fjh> that is the problem with desktop apps, and viruses etc

burn: it knows everything about my device
... if I try to build something in my browser
... I want to be able to do everything I could do with that app

Travis: for each device, the details are slightly different
... having to read up on every driver's technique for exposing information is painful

fluffy: for desktop applications, there isn't a well defined api to get most of the information
... I don't know burn, I don't think it's really what you think
... I think concrete examples will be helpful

burn: if I want to build something as rich as what I can have on my computer
... whatever I can get from my desktop is sufficient
... I'm not asking for more than that

fluffy: I'll want more

<Zakim> Josh_Soref, you wanted to talk about device discrimination

Josh_Soref: my concern is that I'd like to be able to even buy an application and use it on one device
... and then use it on a different device with different parameters, assuming the application is transferable
... the app shouldn't stop working because the app would be hardwired to a specific user agent or device
... I appreciate the ability to have precise details or about general things, I want to make sure we don't make it too easy for apps to be hardcoded for some specific hardware configuration

harald: I'd like to move to the next topic

anant: can we send out proposals for how to deal with Levels and moving from level to level?
... burn can you send in a proposal?
... I'm happy to send in one as well

burn: I'm willing to send in one, let's talk offline


Scenarios

<Travis> http://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/scenarios.html

Travis: I was asked to put together a Scenarios document for Media Capture
... the goal of the document was to establish the set of realistic scenarios
... that lightly touch on the things you might want to do
... with a capture API if you had one available
... I tried to spread it across both browsers and installed applications
... I didn't want to write this as a requirements document
... I wrote it for an Average user
... not one with 30 cameras
... I had a few sections: Scenarios, and Rants and Raves
... [Section 5]
... We at Microsoft have experimented with things
... I have commentary
... Section 5 as it is in the document is what I consider less important to the scenarios
... but still worth considering

<fjh> some comments I offered on the scenarios document -> http://lists.w3.org/Archives/Public/public-media-capture/2012Feb/0012.html

Travis: there are 6 scenarios
... I can talk through them if the group wants

StefanH: maybe you could go quickly through them

Travis: there are 6 scenarios
... the first scenario is a user, like a Facebook user
... using it to capture an image from their webcam
... the app provides a ui, that enables the camera to turn on
... and capture a single image
... the app has the ability to capture audio
... for an audio caption
... maybe converted to text and put on the web page
... the second scenario is about an enthused user
... more active in Blogging
... this is a Video Blog
... and he goes and describes his commentary on the Political situation
... he has some friends who will join him
... they will initiate connections to certain parties
... who will join in at the end

fluffy: when you talked about switching browser tabs

Travis: a detail on the text in the scenario
... he's watching himself do the recording in the browser tab
... he may need to switch to see some other data
... it was meant to say that the ability to have the camera running in the background is important to consider
... the conferencing scenario needs to keep the conference up
... the third scenario is a student working on an image processing course
... the teacher has put specific requirements on the size of the video for the assignment
... the student has to ensure the video fits the parameters
... she can't operate on a large video
... she may not need to trust the site
... fourth scenario is a mobile scenario
... a user traveling
... recording a video diary
... introduces a front and back camera
... record what he can see, and record himself
... and switch back
... fifth scenario is a Conference call
... there's a scribe recording the conference
... there are participants participating via their own browser
... this scenario lets the scribe put the call on hold
... to review the video and catch up on scribing
... I know Josh_Soref would like that, so I put it in there

Josh_Soref: thanks

Travis: the sixth scenario is privacy
... you're browsing and stumble upon a page which asks for permission to use your camera
... and you deny it
... I captured these 6 scenarios
... I think it covers most things
... I'd like to base our api around allowing these cases
... if we want to scope down from this, that's ok

StefanH: thanks
... this document has had a lot of reviewing
... chairs would like to propose moving it to FPWD

fjh: I had a couple of comments
... one that I'd like to go in before
... I think that capturing a media stream in section 5 is another scenario
... I think it warrants another scenario

Travis: I saw that, fjh, and thank you for posting it
... I will try to address that comment before we publish

fjh: thank you
... the other thing is what to do when multiple applications try to access the camera
... I'm not saying it should hold up FPWD
... but I think it's an important concern

Travis: thank you, and I agree with that
... that's going to be interesting
... I might have multiple web browsers
... and an OS limitation might only allow one access at a time
... so if I have 2 open, and they both try at the same time
... it can lead to amusing results

StefanH: if no one here objects, we will ask the WebRTC+DAP WGs
... and then will move it to FPWD

dom: I think we should make a short CfC in both groups

StefanH: fjh will you take care of that for DAP?

fjh: I'm not sure I understand what you're asking

dom: formally WGs need to approve requests to publish drafts

fjh: sure, I can do that
... and if Travis includes the change, that would be great

Travis: ACTION: Travis to integrate fjh's capturing a media stream as a scenario

<trackbot> Sorry, couldn't find user - Travis

<fjh> ACTION: fjh to send CfC to DAP re FPWD of Scenarios document upon Travis edit [recorded in http://www.w3.org/2012/02/09-webrtc-minutes.html#action01]

<trackbot> Sorry, couldn't find user - fjh

Audio WG request to join TF

shepazu: Doug Schepers
... I work for W3C
... I'm the Staff Contact for the Audio Group
... I want to make people aware that people in the Audio group are interested in participating
... we have things we'd like to put forward and discuss
... particularly about multiple line ins
... and possibly MIDI bits

dom: we have two options
... we can convert the TF from 2WG to 3WG
... or we can ask the Audio WG to send inputs to the TF as is

shepazu: I think the lightest weight thing we could do right now
... is for us to send you our use cases and requirements
... I'll talk w/ dom off list
... I'll see if anyone feels the need to actually join the TF
... we're really happy to see this work moving forward
... I don't know if particular individuals
... need to call in or need to participate
... or if they're happy to let the TF do the work

harald: the reason for the TF is because of constraints
... are there any participants in Audio who are not in WebRTC/DAP?
... dom, don't you have a tool to look for overlap?
... I know that Apple / Intel just joined
... they might have a problem
... if you could drop a note to Audio asking if people would have a problem joining the TF, that'd be good

<dom> Overlap between Audio and WebRTC: Opera, Intel, Mozilla, Google

shepazu: dom maybe you and I can talk offlist
... and I'll communicate back

<harald> I think we're looking for the disjunction - orgs in audio who are neither in dap nor in webrtc

<dom> Overlap between DAP and Audio: Opera, Intel, Mozilla

API Document

anant: status...
... we have a document that's published
... and we have some implementations

<Travis> http://dev.w3.org/2011/webrtc/editor/getusermedia.html

anant: and some demo applications
... we've also heard some feedback from web developers
... on how they think the api works for them
... and I think we have some work before we publish a WD
... some issues we need to cover
... Capabilities and Privacy
... related to Capabilities/proposals
... we didn't reach a solution on Hints
... that's issue #1

<dom> (the disjunction is thus Apple, BBC and INRIA)

anant: I don't know if you want to discuss point by point

issue #1

anant: is Media Stream Options
... which is not really defined as a hint
... currently the call should fail if you ask for Video and don't get it
... we might want a way to provide a hint

issue #2

anant: is extensibility
... it seems like WebIDL Dictionaries make sense
... we need a procedure to later extend the options, capabilities, or hints

issue #3

anant: events versus callbacks
... most new APIs prefer not to use Callbacks
... when you're writing multiple pieces of code
... using addEventListener seems to be the standard way
... so multiple pieces of JavaScript code can get callbacks
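
The point about several pieces of code receiving the same result can be illustrated with a small sketch built on the standard EventTarget interface. The MediaRequest class, its grant method, and the "success" event name are hypothetical names chosen for the example; they are not the getUserMedia draft API.

```javascript
// Hypothetical sketch: MediaRequest and its "success" event are
// illustrative names, not the getUserMedia draft API.
class MediaRequest extends EventTarget {
  // Simulate the UA granting access and handing out a stream object.
  grant(stream) {
    const event = new Event("success");
    event.stream = stream; // attach the payload for listeners
    this.dispatchEvent(event);
  }
}

const request = new MediaRequest();
const seen = [];

// Two independent pieces of code register for the same result,
// which a single success callback argument would not allow.
request.addEventListener("success", (e) => seen.push("preview:" + e.stream.id));
request.addEventListener("success", (e) => seen.push("recorder:" + e.stream.id));

request.grant({ id: "cam0" });
console.log(seen);
```

Both listeners run when the request is granted, in the order they were registered.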

<dom> +1 on using EventTarget

issue #4

anant: what the behavior should be when the api is called in an iframe
... currently the US-Geolocation API ignores calls in iframes
... my proposal is to do the same for video

<dom> +1 on limiting getUserMedia to top level page (until we have a better model for permissions)

issue #5

anant: we need to define the behavior for Browsers, as fluffy mentioned
... UI recommendations
... text like "we recommend the UA notify the User when the Camera is transmitting"
... "or the microphone is turned on"
... I don't know how to do this in a consistent way

<shepazu> (I agree with Dom, but can see use cases for delegation, especially with mashups or service reuse)

issue #6

anant: we need to specifically define what happens when you switch away
... it's clear that we need to support it
... if I call getUserMedia
... and I switch away
... sometimes I want it to continue
... and sometimes I don't
... should getUserMedia fail when I switch away
... there are also privacy implications
... I think all should be tackled (considered?) before we move on to a WD
... that's all I had

<dom> (if camera selection is done via the chrome, I think the browser should be responsible for alerting the user of conflicting pages asking for access to a given camera)

anant: getUserMedia
... takes 2 options
... there was a proposal for adding hints
... we've decided we need capabilities
... I think we need an api for capabilities

<dom> can't we get optional video stream with {video: {optional: true} } ?

anant: we can't tie it to getUserMedia
... because you need it before you make the call

harald: reminder to give names before speaking

Travis: I've taken both sides of the stance
... I'm divided against myself
... originally
... before you request getUserMedia
... you want to know if the device supports recording
... because I don't want to promote a pattern
... where you continuously hit getUserMedia probing for stuff
... try for Video+Audio, fail, ok, lemme try for Video
... in that sense, I like knowing about things in advance
... At the same time
... I'd like to call the api with what I need
... and then be informed when it's ready
... or call with my baseline
... and then have a way to adjust my requests
... rather than having to reprobe

dom: Am I supposed to talk only about issue 1?

StefanH: I think they're all tied together

dom: ok
... anant you were listing all these issues
... I don't see them in the draft, and I don't think they're in tracker either

anant: yes
... I just made them up
... I'm happy to convert them to tracker
... is Tracker Tracker or Bugzilla?

dom: let's talk about that offline
... hints v. options
... I'm wondering if we can't have in the option object
... if you could have required:true/optional:true
... I think we have a way to design optional arguments there,
... personally a group of you were at the meeting last week
... I'm a bit uncomfortable with getting abilities
... as it seems to lead to a great privacy breach
... initially, I'd like to design getUserMedia without assuming there would be a capabilities api
... otherwise I agree supporting limiting getUserMedia to the top level page
... and I support moving to addEventListener
... that covers most of the issues
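
dom's suggestion of marking entries in the option object as optional could behave along these lines. This is a sketch under assumptions: the option shape, the resolveRequest helper, and the availability map are invented for illustration and model only the decision logic, not a real capture call.

```javascript
// Hypothetical sketch: the { optional: true } option shape and the
// resolveRequest helper are assumptions used for illustration only.
function resolveRequest(options, available) {
  const granted = [];
  for (const kind of ["audio", "video"]) {
    const wanted = options[kind];
    if (!wanted) continue; // this kind was not requested at all
    if (available[kind]) {
      granted.push(kind);
    } else if (!wanted.optional) {
      // A required kind is missing, so the whole request fails.
      return { ok: false, missing: kind };
    }
    // Optional and unavailable: dropped without failing the request.
  }
  return { ok: true, tracks: granted };
}

// Audio is required, video is nice to have.
const options = { audio: {}, video: { optional: true } };
console.log(resolveRequest(options, { audio: true, video: false }));
console.log(resolveRequest(options, { audio: false, video: true }));
```

With this shape a page states its baseline in one call and does not have to probe repeatedly with different combinations.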

richt: Rich T from Opera
... I have feedback on all of these issues
... we have done some prototyping
... so let's start with hints
... we're really excited about hints
... we don't particularly want the capabilities api
... with hints we can say "in an ideal case, this is what I'd like"
... and the UA will try to match up
... and then fall back where it can't
... hints are a much more web model
... look at CSS
... you request something
... and if you don't get something, it will fall back
... I'm not so particular about audio/video
... I'd like to reverse from "we have capabilities" to saying "we have hints, do we need caps"
... on iframes
... we intend to do the same thing as Chrome
... create a pair, so that an iframe can only have access if there's a trusted chain to the top
... re EventListeners, there are some benefits
... to the current way
... but there are issues
... we may want to experiment
... the UI recommendations for browsers
... we've discussed this as well
... @TPAC, the Firefox/Mozilla guys
... anant demoed
... the UI seems to be converging
... we may want to codify the challenge
... we may not want to lock things in
... I think that's about all I wanted to say
... one last thing...

<Travis> Opera's UI experiment for getUserMedia: http://dev.opera.com/articles/view/getusermedia-access-camera-privacy-ui/

richt: switching away from the application
... you may be able to pin a tab to give it that priv
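
richt's remark that an iframe can only have access if there is a trusted chain to the top can be read in more than one way. The sketch below shows one possible reading, in which trust is recorded per parent/child origin pair and every link from the top-level page down to the requesting frame must be trusted. The pair format, the origins, and the hasTrustedChain helper are all assumptions.

```javascript
// Hypothetical sketch: one possible reading of "a trusted chain to the top".
// The pair format and function name are assumptions.
function hasTrustedChain(originsFromTop, trustedPairs) {
  // originsFromTop[0] is the top-level page; the last entry is the
  // frame calling the capture API.
  for (let i = 1; i < originsFromTop.length; i++) {
    const pair = originsFromTop[i - 1] + " > " + originsFromTop[i];
    if (!trustedPairs.has(pair)) return false; // broken link in the chain
  }
  return true;
}

const trusted = new Set(["https://blog.example > https://video.example"]);

console.log(
  hasTrustedChain(["https://blog.example", "https://video.example"], trusted)
);
console.log(
  hasTrustedChain(["https://blog.example", "https://ads.example"], trusted)
);
```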

fluffy: hints/capabilities/iframes/perms
... on hints, we definitely wanted the ability to select the capture resolution
... I'm willing to look at a proposal without capabilities
... all applications are working in environments with multiple cameras
... I'm not sure how to deal with device selection

<dom> [device selection can be made via the browser chrome]

fluffy: I realize that whatever we do with capabilities will be
... a huge fingerprinting
... issue
... but without turning off third party cookies
... the iframe issue
... historically iframes were supported
... e.g. iframes for ads
... but it's important for mashups
... I liked the opera proposal

<shepazu> +1 for importance of mashups

fluffy: I think it's important to be able to run in an iframe
... the issue w/ running in an iframe
... is you need to know who's getting the media

<StefanH> +1 for iframe's

<shepazu> (might allow delegation and transparency, similar to CORS)

<Travis> +1 for iframe

<richt> fyi, geolocation permissions for iframes: http://lists.w3.org/Archives/Public/public-geolocation/2011Feb/0001.html

<richt> ...and we'll be fixing our behavior for geolocation in this respect in our next release.

fluffy: we had a long discussion at the interim WebRTC meeting
... if you mark stream as No-JavaScript
... then only whomever is on the other side of the stream
... the important thing for Permissions is "who has access to the video"
... which isn't necessarily the web site

<shepazu> (once the stream goes to any server, that server might send it anywhere… some degree of trust must be there)

<dom> shepazu, actually the protocol allows to make it so that the server doesn't get to see the stream at all; but it would probably need help from the API

<jesup> shepazu: for webrtc, often the stream is peer-2-peer, encrypted, never seen by server

<shepazu> (but is this only for WebRTC? for an audio-editing site, stream might be sent to host server)

<jesup> shepazu: some uses it goes to server, some it doesn't. And a user may want to record locally without exposing to the server

<jesup> shepazu: If I'm recording locally, I don't want the app to be able to surreptitiously send a copy to the website (at least not unless I 'trust' them)

<shepazu> (jesup, agreed, but there are multiple scenarios to meet here)

<jesup> shepazu: See the meeting slides from the W3C WebRTC interim on MediaStream Security

<jesup> shepazu: hang out for a sec while I dig it out

<jesup> shepazu, http://www.w3.org/2011/04/webrtc/wiki/images/a/a3/MediaStream_Security_1.pdf

harald: [ removing chair hat ]
... I liked anant's idea of going to a different model for getUserMedia
... I think we need a different name
... I'd like to encourage anant to write up a better proposal with a different name closer to javascript style
... in previous discussions
... we had the idea of capabilities to turn around and feed them back in
... I'd like to investigate that somewhere too
... we had pretty strong support for capabilities
... it isn't about going down the slippery slope
... but how far

burn: +1
... I think there's a need for capabilities
... as I summarized at the beginning of the call
... there are definitely cases like Presence for Dating sites
... first you're going to check capabilities
... check if devices are plugged in
... then you try to get access
... it may not be available "now"
... and that's entirely reasonable
... especially with competing applications
... w.r.t. hints
... my concern is that
... application authors like knowing what they're getting
... I understand how HTML works today
... and CSS allows for fallbacks
... but I would argue that most web app authors do not find that a feature
... they consider it a bug
... they acknowledge it, but constantly push for more control
... when I was working on Voice XML
... people pointed out that HTML when it first came out was a step backward for Desktop Publishing
... but HTML has evolved
... and authors would do things like using Flash to get more of what they want
... what I'd like for hints
... is to specify "ideally I'd like this", "but these other things are acceptable"
... for video, maybe I care more about width than height
... maybe I care about aspect ratio
... if I can't get these characteristics, then I don't want anything

richt: I like that
... hints for me aren't necessarily quality related
... looking at cameras, hints specify a preference for which camera
... the one I'd like

StefanH: as chair, we're running out of time
... personally I have a hard deadline
... shepazu and I can take it to the list

[ Group agrees to move offline ]

Travis: a response to hints/capabilities
... but I think this may be best served as a list discussion
... and maybe proposals written down

richt: hints are privacy preserving
... you're requesting something, rather than knowing what happens
... I'm designing for the web case
... trusted fine, we can install/elevate
... but hints work much better in the untrusted environment

Josh_Soref: +1

StefanH: we have to move this discussion offline

Next Call

Travis: I'd like to ask for another Teleconf
... and put out when we'd have it

StefanH: yes

harald: and we'd want to do that before the March IETF meeting

<burn> agree with having new call

Travis: 2 weeks would be good

<dom> [please avoid overlap with Mobile World Congress if possible]

<harald> dom, what are the MWC dates?

<anant> yes, lots of people at MWC

<dom> Feb 27-March 1st

Action Items

anant: EventListener

StefanH: anant had an action to write up a proposal

fluffy: burn and I have an action for capabilities
... one's easier and one's harder
... (goes in and comes out)
... anant: you and I should work on the first

StefanH: the last action was Travis to update the draft
... and then the WebRTC/DAP chairs will send out CfC

trackbot, end meeting

Summary of Action Items

[NEW] ACTION: fjh to send CfC to DAP re FPWD of Scenarios document upon Travis edit [recorded in http://www.w3.org/2012/02/09-webrtc-minutes.html#action01]
[End of minutes]
