- From: Dominique Hazael-Massieux <dom@w3.org>
- Date: Fri, 08 Feb 2013 17:26:01 +0100
- To: public-media-capture@w3.org
Hi, The minutes of the Media Capture Task Force F2F meeting on February 5, 6 and 7 are available at: http://www.w3.org/2013/02/05-mediacap-minutes.html http://www.w3.org/2013/02/06-mediacap-minutes.html http://www.w3.org/2013/02/07-webrtc-minutes.html (first two agenda items) and copied as text below for indexation. Dom ============ Media Capture Task Force F2F Meeting 05 Feb 2013 [2]Agenda [2] http://www.w3.org/wiki/Feb_5-6_2013 See also: [3]IRC log [3] http://www.w3.org/2013/02/05-mediacap-irc Attendees Chair hta, stefanh Scribe Josh_Soref, Ted_Hardie1, ekr, fluffy Contents * [4]Topics 1. [5]Error Handling 2. [6]"immediate stream" gUM 3. [7]Device reservation * [8]Summary of Action Items __________________________________________________________ <trackbot> Date: 05 February 2013 <dom> ScribeNick: ekr martin: proposes slides... <stefanh> Martin's slides: [9]http://www.w3.org/wiki/images/7/7f/Device_Enumeration.pdf [9] http://www.w3.org/wiki/images/7/7f/Device_Enumeration.pdf <dom> [10]Device Enumeration presentation [10] http://www.w3.org/wiki/images/7/7f/Device_Enumeration.pdf requirement: two different sites get different identifiers <dom> ScribeNick: dom ekr: what's wrong with just numbers? burn: they can't have the same meaning across sites [several]: they may be the same, but there is no guarantee they are burn: it's important that browsers don't implement it in the way that in practice, people can rely on it <scribe> ScribeNick: ekr martin: monotonically increasing is a management exercise per browser. juberti: this is all without any permissions. a site can find out how many audio and video devices you have? martin: yes ... are people comfortable with the privacy properties and is this a valuable function? fluffy: this is the right thing to do, but I think you also need to be able to ask for a human-readable string that might be used to identify the device. ... this adds fingerprinting surface. 
martin: privacy-preserving option would be to make this available to the site only after granting consent. fluffy: is there a way to get permission to all cameras? <Ted_Hardie> ScribeNick: Ted_Hardie fluffy: is there some easy way to ask: I want permission for all cameras? Dan: I don't think that's necessary, Cullen. The browser is the one who actually knows. ... the application requests a source id; the browser has the opportunity to name it to the user. ... one of the values became clear to me at WEBRTC expo. A user asked about setting up medical devices. ... The nurses won't know about this, and this would provide them the right info. Justin: is this the right approach? Hey the browser should have a way to know and expose this (the user selects in the browsers) Martin: this is a poor user experience. Justin: do you have a use case in mind? ... We need to have a strong use case that justifies this. Dan: This is a constraint: if it is not satisfied, then it goes back to the default. Justin: We need to have a strong use case that justifies this if we are going to do this work; if Room: HAAAH (Speaker excitement) Justin: I get the hint. ekr: first of all, I can't tell you how much I despise every new (mumble) gets a new name. We can't have a smellovision constraint. fluffy: the current API does have a method for getting all the types. There is another "tell me about everything" intended here. room: can you clarify, ekr? ekr: I hate that we enumerate only audio and video and those separately. If we come up with something new, like smellovision, testing for it using this system will be painful. ... I do think this functionality is needed. Desktop applications do provide "select camera" functionality. We don't want to go out to OS or browser to get a preference dialog. Martin: to be clear about interrogation: after the consent only. 
ekr: fluffy suggested something different <fjh> +1 to ekr re ability to control which device from application for usability; <dom> ScribeNick: ekr abr: is this functionality we need? I would be unhappy if I always have to select the camera on web sites. jesup: certain applications are interested in different devices. you don't want to have to flip the browser or OS default whenever you switch applications. giri: unclear how you implement the unique ids. martin: plan was to run the GUID through a hash in order to generate the site-specific mappings ... one global browser secret. giri: how does the user clear out the mapping. martin: when you clear cookies. giri: native apps? looks like an uninstall/deletion event. fluffy: in the name of protecting privacy, we have constructed a situation where every app will ask for full permission. ... this is what happens if we don't give you any access before full permissions. martin: I don't have any research on that. hirsch: this is like cookies? [yes]. What about correlation across sessions and across users. <dom> [re use case of device enumeration, it seems to me that Google Hangout exposes available devices in its UI for the user to select FWIW] hirsch: needs to be more details in the doc. juberti: the proposal is that initially you can only get ids, but then once you have permission for a device you can interrogate its properties. ... what about access to the names of the other devices. ... the microphone about camera and microphone seems more sensitive ... this seems like the right compromise barnett: what if the user could pass in a usable string... <fjh> medical device case makes it clear that privacy risks and mediations need to be enumerated <fjh> apart from fingerprinting knowledge of names of devices seems benign, or am I missing something? clarification on the proposal: the app gets to provide a string that the user can use to choose. 
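[Editor's sketch, not part of the minutes.] Martin's plan above (run the device GUID through a hash keyed by one global browser secret, with the mapping reset when cookies are cleared) could look roughly like the following. This is a minimal illustration in Python rather than browser code; the function names, the use of HMAC-SHA-256, and the made-up GUID are all assumptions, since the minutes do not specify the construction.

```python
import hashlib
import hmac
import secrets


def new_browser_secret() -> bytes:
    """Hypothetical: one random secret per browser profile.

    Regenerating it (for example when the user clears cookies) changes
    every derived identifier at once.
    """
    return secrets.token_bytes(32)


def origin_device_id(browser_secret: bytes, origin: str, device_guid: str) -> str:
    """Derive an opaque ID that is stable per origin but differs across origins."""
    # The NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
    message = f"{origin}\x00{device_guid}".encode("utf-8")
    return hmac.new(browser_secret, message, hashlib.sha256).hexdigest()


secret = new_browser_secret()
guid = "usb-camera-0001"  # made-up device GUID for illustration

id_for_a = origin_device_id(secret, "https://a.example", guid)
id_for_b = origin_device_id(secret, "https://b.example", guid)
id_after_clear = origin_device_id(new_browser_secret(), "https://a.example", guid)
```

Under these assumptions the same origin sees the same identifier on every visit, two origins cannot correlate their identifiers, and replacing the secret invalidates all earlier identifiers.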
ekr: users ignore all the information in the choosers jesup: app can explain anything it wants outside of the picker. <fluffy> "Bad guys would use this for more than the good guys" - Martin ekr: so new devices that were conceptually like cameras and microphones would be interrogated this way juberti: once you had permissions you could ask for anything? fluffy: trying to make certain we leave with a decision ... You can get all the device IDs and then they are stable. <dom> [not so much "site" as much as "origin"] fluffy: Once you have the device IDs you can ask for type. ... Given a device ID, you can ask for the device subsequently juberti: so how do I make the picker? Given that I don't have access to other devices <dom> ScribeNick: Ted_Hardie fluffy: how much other information are we now providing without consent? Justin: Do we return this information off what we get from this or from getDevices? ekr: one thing people talk about is allowing the javascript to ask for camera and mic, but in a restricted domain. ... This is bound to a different domain, so it can't be seen by the javascript. I think we need to be clear whether we believe giving people access to one camera gives access to all of them or whether we can let them see what the camera sees without "getting access to it", so we can build this picker. <ekr> to clarify my statement at the microphone, you would probably need to have that request not require a permission grant. <dom> ScribeNick: ekr ted_hardie: mobile cameras often point in different directions. <dom> [I think the first requirement for a good proposal is to be an actual proposal :) ] fluffy: need to be able to grant access to some cameras and not others. don't want to get dialogs for each camera. paulk: is it possible to have two levels of permission? 
giri: are we considering changing the requirements to potentially accommodate this proposal? harald: the chairs don't think this is inconsistent with the requirement giri: it seems inconsistent to me. hta: we discussed this at the last meeting. the number of devices isn't sensitive juberti: you probably want some sort of callback if the list of devices changes. martin: polling? juberti: what's the rationale for not exposing a callback? Josh_Soref: do you have WebIDL for this API ... is it mutable or fixed? hta: closes mic <gmandyam> From Giri: the requirement P5 in the use cases and requirements doc will not allow for human-readable descriptions of devices not available to the user, but Martin's proposal does not cover this feature fluffy: when do permissions get re-asked? if someone says no.... Josh_Soref: can't this be an implementation detail. <Ted_Hardie> speaker: you can have a "re-insertion" type event that makes a new device that looks remarkably similar to one that was previously excluded. <dan_romascanu> we hear 80% of the speakers Error Handling <burn> scribenick: burn <Ted_Hardie> speaker: that allows you to get past the "accidentally no means forever no" <timeless> scribe: Josh_Soref <timeless> scribenick: timeless hta: define some terms for Error Handling ... there are two things we can call Errors ... application asks API to do something ... what it does succeeds or fails ... the other thing is "oops, something went wrong" ... i'm not talking about the last category ... that's obviously going to be a callback/event thing ... i'm focusing on the Application asks the Browser to do something ... and it goes wrong ... currently, in the API, ... there's a function called ... you know exactly one and only one thing will happen ... you get thrown an exception (e.g. illegal argument) ... you get a success callback ... or you get an error callback [ Next slide ] [ Current language ] hta: this is what's in the spec atm ... 
i'm not all happy with the language of the spec ... it's more or less ... saying what i want to say stefanh: is this WebRTC or MC? hta: this is WebRTC burn: we decided these are not the only conditions under which an exception can be thrown ... i need to look up the specific wording hta: if you parse it as logical statement ... that one [points to something] is not what we want ... we also decided ... error identifiers are strings, not numbers [ Next slide ] [ Desirable properties not found yet ] hta: when we have illegal params ... we should have exactly the same behavior as any other API ... what compilers do... it's specified in WebIDL ... i forget what those errors are ... it should be easy for an application to predict ... if i do this mistake, and if i do that mistake, i get an error callback ... browsers should be reasonably consistent [ Next slide ] <martin__> consistency is a crutch for the weak-minded [ Alternative API structure ] hta: alternatively, you make a call, and return an object ... and then success/failure is available from the returned object ... indexeddb is doing this ... whereas we're doing what geolocation does burn: from the January call ... programming errors are thrown exceptions ... others are error callbacks martin: +1 [ Next slide ] <martin__> that was +1 to Justin doing more work [ Emulating "alternative" in JavaScript ] [ slide shows it's relatively easy to convert from one style to the other ] [ Next slide ] [ Evaluating this change proposal ] hta: there's no clear advantage ... developers will have to deal w/ both patterns anyway ... library developers can mask one with the other ... changing stuff is disruptive ... proposed: No Change stefanh: you're talking about getUserMedia ? hta: i'm talking about getUserMedia and also ... AddStream, AddTrack, GetStats stefanh: that's for the other group hta: having the two groups being consistent would be a "bad" idea ... there's reasonable overlap between the two groups ... 
i'd like to see if there are comments on this proposal (no change) ekr: this is just on promises v. errors hta: yes ekr: i support this decision ... i see no benefit ... are we proposing to continue w/ success+error callbacks for all calls? hta: i think so ... it's trivial to ... ... we'd like people to be unwise explicitly <martin__> adam.roach speaking adam.roach: [ something ] adambe: A style or Session style ? ... we can't force people to ... ... we can't run a call without an error callback ekr: you could <martin__> adambe: looking at error callbacks against promises - it's impossible to force the setting of error handlers on a promise gmandyam: dom exceptions, they're defined ... in the html 5 spec dom: webapps group is asking that if you want new types/errors, you should coordinate with them ... once we know what we want, we need to communicate with them jim: we define an object ... where values are only valid during the scope of the error ... there are async errors, like OOM ... which is from global hta: errors i'm not talking about are a different topic [ Next slide ] <scribe> [ New API points - design ] hta: from previous discussion ... there's no chance of getting consensus in the room to change the api ... we add GetStats(), Recorder ... we should have a principled decision for these ... use Callbacks ... use Status objects ... "no consistency needed" ... i'd like a decision on this cullen: i'd like consistency with callbacks ... for peer connection <burn> +1 martin__: i like disagreeing w/ cullen ... consistency is a clear sign of a weak mind ... be as inconsistent as possible ... invent something new [ laughter ] dom: question about consistency is defining scope ... i don't know that every other WG is agreeing w/ status object ... in one of the documents that Robin Berjon is editing ... Web API cookbook ... if Callbacks work for all cases, then let's keep using callbacks ... 
but there are cases involving extending the api martin__: what dom said is more along the lines of what i believe ... we have a narrow focus in this group <dom> [11]Specifying callbacks in Web API Design cookbook [11] http://darobin.github.com/api-design-cookbook/#specifying-callbacks martin__: people doing WebRTC ... and IETF ... consistency within this narrow wedge of the web platform ... isn't that wise ... if there's good guidance, then we'd be stupid to ignore it cullen: i agree with martin__ on that hta: 0, 0, 0, and we should read another document stefanh: we have onEnded events ... and onError for recorder ... we already have a mixed model ... as long as it seems fitting, we should do callbacks hta: you ask the api and it has exactly one result ... and stuff where you ask and things just happen ... this principled decision ... we have arguments for Callback ... and we have pointers to sage advice ... if we look and aren't swayed, we don't change? [ Next slide ] hta: i always have a next slide ... oh, i deleted it jim: we need to decide on error classes/not, maybe not now martin__: dom's advice ... stuff just happens advice ... we have events ... does that work for you jim? ... we generate an event and fire it ... and everything works? jim: sounds like another group wants to control those events martin__: events we know how to define ... dom was talking about DOMError jim: i'm thinking about Error gmandyam: DOMException requires modifying the HTML5 spec ... DOMError you don't jim: you have an attribute inside an object ... it's only valid within scope ... but avoids having to coordinate with another group hta: what happens if you have 2 errors happening in quick succession? martin__: it doesn't work that way ... it's a single threaded application ... you have two events ... you generate a callback ... during the scope of that callback, it's set, ... on return, you clear ... 
and for the next, you set the next values for the next callback hta: i'm not understanding this ... who is you? martin__: the browser ... it queues events ... and sets values for the callback scope and clears on return jim: there's another limitation ... a bunch of our objects now have an error name and an attribute dom: the DOMError interface has a `name` attribute ... which we should reuse names that exist ... but what to do when we need new names? ... DOM4 says if you need new names, contact us ... we should maybe try the path or discuss it ... on the callback question ... are there many cases where we expect the developer will need to react to several error callbacks? hta: Recorder... ... you get Data + Ended at roughly the same time ... but only one is really an error... dom: we should be clear in our algorithms when and if there are cases where an error callback invocation doesn't end the operation adambe: it seems there's some confusion if an attribute is an error callback ... or an error jim: the attribute is a string, the name of the error ... you raise a standard DOMError ... when you process that, you read the error and see what it says ... it keeps you from defining new classes adambe: why have an error string that's overwritten? ... if you dispatch objects, you can store them in a queue jim: we could also use custom events hta: we need a presentation on that, ... but by someone who has read the specs ... Further study on Errors that happen <dom> [12]DOM4 definition of errors [12] https://dvcs.w3.org/hg/domcore/raw-file/tip/Overview.html#error-names-0 <martin__> ACTION: hta to come up with a concrete proposal on what to do with error classes, based on the discussion on DOMErrors/DOMEvents. [recorded in [13]http://www.w3.org/2013/02/05-mediacap-minutes.html#action01 ] <trackbot> Created ACTION-14 - Come up with a concrete proposal on what to do with error classes, based on the discussion on DOMErrors/DOMEvents. [on Harald Alvestrand - due 2013-02-12]. 
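[Editor's sketch, not part of the minutes.] The error-handling discussion above rests on two points: a call has exactly one outcome (a thrown exception for a programming error, otherwise a success callback or an error callback), and the callback style can be converted into the "returned status object" style fairly easily. The toy below illustrates both in Python. It is not the getUserMedia API: `get_media`, `Request`, the `camera_present` flag, and the error name are hypothetical, and the callbacks run synchronously only so that the example stays self-contained.

```python
from typing import Callable, Optional


class MediaError(Exception):
    """Hypothetical error identified by a string name rather than a number."""

    def __init__(self, name: str) -> None:
        super().__init__(name)
        self.name = name


def get_media(constraints: dict,
              on_success: Callable[[str], None],
              on_error: Callable[[MediaError], None]) -> None:
    """Callback style: exactly one of raise / on_success / on_error happens."""
    if not isinstance(constraints, dict):
        # Programming error (illegal argument): thrown, no callback fires.
        raise TypeError("constraints must be a dict")
    if constraints.get("video") and not constraints.get("camera_present", True):
        on_error(MediaError("HypotheticalNoCameraError"))
    else:
        on_success("stream-1")


class Request:
    """Status-object style: the outcome is read from the returned object."""

    def __init__(self) -> None:
        self.result: Optional[str] = None
        self.error: Optional[MediaError] = None


def get_media_request(constraints: dict) -> Request:
    """Wrap the callback-style call so it returns a status object instead."""
    request = Request()

    def succeed(stream: str) -> None:
        request.result = stream

    def fail(error: MediaError) -> None:
        request.error = error

    get_media(constraints, succeed, fail)
    return request


ok = get_media_request({"video": True})
failed = get_media_request({"video": True, "camera_present": False})
```

A library author could apply the same wrapper in the other direction, which matches the point made in the room that developers can mask one pattern with the other.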
hta: i think we'll have coffee cullen: what do we change? hta: atm, nothing ... but we may eventually change every class which has Error to something maybe like DOMError [ Coffee break for 15 minutes ] "immediate stream" gUM martin__: the basic problem that came up in the WebRTC meeting ... it became difficult to make a good UX ... with gUM and PeerConnection ... the consent dialog blocked the creation of the MediaStream <JonLennox> timeless — are you scribing? martin__: but the MediaStream was needed to negotiate the stream <JonLennox> (I am happy not to) martin__: the idea was that we would create placeholder streams ... we went back and forth a number of ways ... there were concerns about usability ... the conclusion was to create a bastard step child of the two proposals ... and merge them together ... there's an email on the list w/ WebIDL for these things ... it isn't really synchronous anymore ... you have the existing api ... constraints ... pass in success callback, error callback ... if you want two video streams ... you'd call this twice before relinquishing control to the browser ekr: the premise appears to be ... that i can't generate an Offer-Answer ... the appropriate SDP ... prior to having a Stream Permission grant ... because i don't know the exact characteristics ... i have no way to tell the PeerConnection what i want ... the only info that PeerConnection needs is the number of cameras ... i don't think any syntax lets it do it ... describe addressing the requirement ... regardless of syntax ... for SDP, you need more than just count martin__: there's an assumption that there's a single SDP cullen: all existing video codecs transmit Resolution in SDP martin__: you have 2 cameras and don't know what they are ekr: browser knows the cameras ... app doesn't know anything ... user hasn't granted permission martin__: you can't send resolution in SDP because you aren't supposed to reveal it before permission is granted justin: you can mask it ... 
send resolution in band cullen: doesn't work for any video codec ... there isn't an RFC for doing that in VP8 JonLennox: for XR ... there's XXX ... characteristics of the encoder ... the encoder implementation is probably fingerprintable justin: you can do a reoffer ... what we need to do is get the ID in the ... and when i get authorization, i don't need to do additional signalling martin__: i'm less concerned about audio clipping ... as JonLennox points out <JonLennox> JonLennox: in H.264 you negotiate profiles and levels martin__: it isn't possible for video to make sense of the data from camera cullen: for audio, it's just a clipping issue ... you could always send an invite that's just a data channel ... and then send an invite over the data channel ... be careful about the problems you're trying to solve here martin__: we spent a lot of time in Lyon ... we concluded on Clipped Hello ... someone answering the phone ... you wouldn't be able to do your ICE negotiation ... so once microphone is granted ... you wouldn't be able to send right away ... for Video, people expect to wait some time ekr: i think it's useful to define the problem ... for Clipped Audio Hello ... they don't require this martin__: no one's proposed one ekr: cullen suggested negotiating the data channel ... in the case where there are devices w/ different capabilities ... what do you expect the Browser SDP to be ... when it isn't known what the user is going to select justin: is the case that ... the browser doesn't know what to do ... are they enumerable? ... i think they're fairly finite ekr: camera 1. H264 encoder ... camera 2, no H264 encoder ... and i don't have H264 software justin: please tell me how i'd set up that configuration cullen: you have a logitech camera justin: what percent of people will have an aftermarket camera cullen: iPad, H264 encoder for the hardware justin: the H264 there is hardware distinct from the Camera ... 
depending on which camera is chosen, the offer in SDP will be different ... you could offer the best you can do ... and you could always go lower jesup: on the receive side, there's an issue about receiving something you can't decode ... to the main point, how we avoid clipping ... there are options ... using fake/empty streams for negotiation ... if the app is asking for permission for audio/video ... you can use fake/disconnected streams ... as soon as it gets permission, it can re-swizzle the streams being fed in ... much as you would do for mute ... you wouldn't need another negotiation ... if you need it, you do, if not, you don't ... cullen's proposal of using a half-RTT ... it's a direct to the Peer, without the server martin__: that's perfectly reasonable to do jesup: do we need another solution to the same problem? martin__: that's where i was going to get into other things this potentially provides ... how you provide multiple cameras is kinda weird ... you call gUM twice ... it's possible for the browser to give you the same camera twice justin: weird, but is it inconsistent? martin__: if i want 2 cameras ... what is the obvious api to get 2 cameras? ... we don't do it that way ... you ask for a camera ... you ask for another camera ... it gives you the same camera justin: it could give you a reference counted object martin__: but you wanted two cameras ... not the same twice justin: source ids? martin__: this slide uses source ids justin: you call gUM ... you get an object back w/ ids ... you have space for tracks ... and once you get consent, you pipe in w/ media ... once media fills in, you don't need signalling ... you don't have clipping cullen: you thought that was the current spec or the proposed change? justin: i thought that was the goal martin__: there were benefits, providing the constraints to the tracks justin: you call gUM ... you don't get the stream back immediately? martin__: you make a dummy stream ... 
you ask gUM to fill in the stream ekr: can i pass this stream to peer connection ... prior to gUM filling it in? martin__: partially, peer connection never really works correctly hta: it's fairly normal to call something twice when you want to do something twice martin__: if you call it twice from the same context ... if you call it once, and then settimeout and call it again ... it could behave differently hta: ok to receive audio, ok to receive video ... we can make more constraints ... for `number of m=lines` <fluffy> M- lines burn: i like this proposal ... we'll talk about settings tomorrow ... when we originally talked about how gUM worked ... if you called gUM and ... got access ... then it wasn't available for someone else to get it ... but even in that model ... MediaStreams can be created from other Streams ... that gives you multiple Tracks pointing to the same source ... i don't know if we have to allow gUM to give access to the same source ... i don't know if you need gUM to get multiple tracks pointing to the same source cullen: i think this discussion has pointed out how confusing this is if you call it multiple times ... you create a PeerConnection, add tracks to it ... it's the Create Offer that's problematic ... i don't think you can have the Offer before you bind the stream ... i think there's existing SDP that won't work that way martin__: that's another problem w/ offer-answer cullen: i understand that ekr: everything is interconnected, unfortunately ... burn pointed out you can synthesize streams from others ... points to difficulty and a way out ... tim maybe ... another way to dig myself out ... instead of rewriting gUM ... you synthesize a dummy media stream ... w/ generic audio-video tracks ... and when gUM returns, we swap those out martin__: it's kind of what this does ekr: i'm suggesting we create a fake media stream ... and those objects have no meaningful information ... 
the only thing you could offer is what you can do in software w/ no hardware support <Ted_Hardie1> Comfort-noise only audio.... ekr: maybe you can produce an offer ... maybe you can't burn: how does attachment happen? <Ted_Hardie1> Fluffy: the peer connection would have a replace track action cullen: peerconnection would have a replace track <scribe> scribe: Ted_Hardie1 Jesup: the original idea had the ability to take a track and construct it from other tracks and other places. ... we can re-use that idea. This is similar to mute, so you don't have to have renegotiation. Martin: turns out that you have to renegotiate if there is a serious change in the characteristics anyway Paul: apologize, because I don't follow this that well, but it seems like if you have info about the device, even if you don't have access to the device, you can do what you want here. burn: I was going to say that one thing that is interesting about moving this to be a replace track on a peer connection, is that the issues only show up when you're sending it to a peer. ... they don't show up when you're using it locally ... those already look like a dummy track: the source and sink are local, but there isn't a clear notion of a track burn: we've created a virtual track, and that's convenience. All of this trickiness comes about because we're negotiating it over a peerconnection, burn: so it might be better to do it in peer connection, not gUM Tim: Jesup's suggestion was what I suggested in Lyon, <scribe> ACTION: item to Tim Teriberry: investigate ReplaceTrack as a solution to the problem. [recorded in [14]http://www.w3.org/2013/02/05-mediacap-minutes.html#action02 ] <trackbot> Error finding 'item'. You can review and register nicknames at <[15]http://www.w3.org/2011/04/webrtc/mediacap/track/users>. [15] http://www.w3.org/2011/04/webrtc/mediacap/track/users>. 
<stefanh> ACTION: Tim Terriberri to investigate replace track [recorded in [16]http://www.w3.org/2013/02/05-mediacap-minutes.html#action03 ] <trackbot> Created ACTION-15 - Terriberri to investigate replace track [on Timothy Terriberry - due 2013-02-12]. Martin: if you have an API that shows you all of the devices, but doesn't allow you to turn them on, then you could negotiate in advance of the consent for turning them. Fluffy: instead, we're doing dummy tracks that allow you to default blank one. Those might create fingerprinting problems. Martin: you do one pixel by one pixel Fluffy: but then you need renegotiation. Martin: audio doesn't Justin: what about advertising HD, but then negotiating down Martin: that's the other way to do it: say everything you can do, then negotiate down to what you will. Justin: there seem to be two different approaches here: do we want to keep bimodal functionality? Why not just shift this to the second approach? Martin: you were there in the December teleconf, where we thought of this, and we didn't return promises or streams that weren't open yet etc. Ekr: Isn't this the same thing? What if attach this a "black" stream? Martin: seems to be reasonable to me Justin: I would prefer this be a single syntax, so we don't have two ways to do everything. ekr: Large step back: what's the problem we're trying to solve here. Is it only media clipping? room: can somebody walk through the clipping problem. Martin: walks through the set-up/user experience issues. ... when the user clicks "yes, you can have the camera", that is the expectation that it is setting up the call. Fluffy: send a provisional answer that marks them all receive only, that will set up the ice negotiation, then you can send the video before sending the offer. room: various questions ekr: this seems like a pretty heavy weight change to solve this problem. Martin: I'm just the messenger ekr: But I'm not the only person who thinks this isn't the most awesome thing ever. ... 
If we must have a way to set up a temp stream. I am not sold that this last syntax (fake stream plus pieces) is what we want Martin: accepting proposals ekr: replace track would have to be mine Mandy: there are some video conditions in which it is obvious that it needs to be asynchronous. I think you're going to have a difficult time finding a method that works in all conditions. Giri: You're going to have to set various constraints. Martin: Note that the set constraints operates on ones that are already live. Sorry, thanks. Giri: that is not the equivalent of set constraints? Martin: yes, Stefan: this is both a method to solve the clipping problem and using getUserMedia with finer grained control. ... what I am saying is that you can start sending media before you send the answer, but that won't work, because the other browser won't be able to process it. There can be multiple ssrcs arriving, and you have no clue what they correspond to. Justin: this is a PRANSWER for everything, Martin: comment 22 (expletive) Justin: why this is actually useful: you can get DTLS and ICE hot through PRANSWER; that's the critical part of why we need the IDs (to allow the remote UI to set up). The second is we already have a complicated state machine (we have a lot of other asynchronous pieces); having this be asynchronous is going to prove interesting. having this be synchronous makes the application writers' lives much easier. Fluffy: I think there are ways of solving that track/mapping issue without breaking this. ... but as a general rule of thumb, if I send early media that matches what the "dummy" thing set (two audio and one media claimed and that's what's delivered) Martin: we don't have a good way of ensuring that the configurations of streams and tracks on one end matches the configurations on the other end. Justin: you're bundling as a simple thing and you don't have a demultiplex method ... 
cites rtcp Fluffy: but that's statistics, not media Justin: but we can't use them by media type without going down a long complex series of special cases. Let's do this the easy way. ekr: comment not caught Jesup: I think Tim's method will simplify the state machine. ... In some of these cases when you change track, you have to renegotiate; in other cases you don't. Fluffy: the easy way to do this is to pull Bundle, that makes this simple Justin: All because you don't want to return objects synchronously <dom> [Justin's (tongue-in-cheek) question was "why don't we get rid of SDP?"] Fluffy: no, when you have complex negotiations with multiple streams and tracks Bundle is complex. It may be easy now in your situation, because it will be complex ten years from now. Justin: let's get rid of SDP (groans, laughs) Hta: can you repeat your comment for the minutes? Martin: I need a plan. <hta> The idea that each video stream needs its separate source/destination IP port pair is incompatible with the SDP of 20 years ago, it's incompatible with stuff that's practiced today, and it's just plain stupid. Justin: We discarded this but we could revisit: what if media would not flow on a media stream until there was user consent? ekr: that's syntactic sugar on this. <hta> The question of whether audio and video can travel on the same port pair is a different question. ekr: I can barely distinguish these two. burn: is this a muted stream effectively? Martin: yes burn: that works well with the amended setting proposal. ... you can always have things go away. no matter what you have negotiated, that can happen ... the settings proposal talks about an over-constrained situation ... lots of things can cause this. Instead of breaking all your track connections, you can call "muted" any track that has become over constrained. ... mute anything that's wrong Justin: you get audio and video track, if it was negotiated but there was no video camera? It just stays there? 
Martin, Burn: if it's an immediate failure, you can just fail, but if you have it later, that works. adambe: you didn't show the last slide with advantages of the template method. ... I like return Media stream for some things, but I like this for other reasons. Fluffy: I have a concrete high level proposal, but without ditching bundle. ... You create all of these tracks, muted; once permission is granted, they are unmuted. Any that did not get permission are treated as if the "camera/device" were unplugged. ekr: Doesn't this create a media stream for every camera? Fluffy: yes, but the ones you don't use get blown away. Justin: I don't like having two ways to do this. Martin: I want to get rid of the "constraints can be imprecisely specified" option Justin: I did not get that Martin: probably Harald that wanted that. Justin: I would like to see more info on the cases in the next slide. ekr: I don't think what Cullen suggested fixes the problem ... the problem is the sdp for the platonic ideal for devices I might have ... it doesn't help me to add five cameras and not know which one the user is going to select ... I am concerned that we're going to end up with replace track anyway. ... I claim replacetrack will solve this. ... I would like to solve this problem once; if we are going to ditch replacetrack or agree it doesn't work, I'm ok on this approach, despite its warts. We're going to need a lot more clarity on how muted/black devices behave. ... (goes back a slide) ... Goes through case with dummying out the streams: you get some cases when you get full permissions, some where you get partial, some none. Need concrete understanding for each of these ... what's the timeline on replacetrack frederick: not following the full set of technical issues: privacy issues vs. clipping. I thought what Cullen was suggesting was using muting to do call set-up before media flows. Trying to understand if that's the proposal? Martin: basically yes. 
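The muted-placeholder idea recapped above (create every track muted, unmute the ones that get permission, treat the rest as if the device were unplugged) can be illustrated with a small state model. This is only a sketch under stated assumptions: the names PlaceholderTrack, createPlaceholders, applyConsent and summarize are hypothetical and do not come from the getUserMedia or WebRTC drafts, and the three states simply mirror the full, partial and no-permission cases that ekr asks to have spelled out.

```typescript
// Hypothetical model of "start muted, unmute on consent".
// None of these names are taken from the getUserMedia or WebRTC specs.
type TrackState = "muted" | "live" | "ended";

interface PlaceholderTrack {
  readonly kind: "audio" | "video";
  readonly label: string;
  state: TrackState;
}

// Create one muted placeholder per requested device, before any user consent.
function createPlaceholders(
  requests: ReadonlyArray<{ kind: "audio" | "video"; label: string }>
): PlaceholderTrack[] {
  return requests.map(
    (r): PlaceholderTrack => ({ kind: r.kind, label: r.label, state: "muted" })
  );
}

// Apply the user's decision: granted tracks go live, all others end,
// as if the device had been unplugged.
function applyConsent(tracks: PlaceholderTrack[], granted: ReadonlySet<string>): void {
  for (const track of tracks) {
    track.state = granted.has(track.label) ? "live" : "ended";
  }
}

// Summarise the states so each permission outcome can be inspected.
function summarize(tracks: ReadonlyArray<PlaceholderTrack>): string {
  return tracks.map((t) => `${t.label}:${t.state}`).join(",");
}

const tracks = createPlaceholders([
  { kind: "audio", label: "mic" },
  { kind: "video", label: "cam-1" },
  { kind: "video", label: "cam-2" },
]);
console.log(summarize(tracks)); // every placeholder starts muted
applyConsent(tracks, new Set(["mic", "cam-1"]));
console.log(summarize(tracks)); // partial grant: cam-2 is ended
```

The sketch deliberately leaves out the harder question raised in the discussion, namely what the negotiated SDP should look like while the tracks are still placeholders.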
Burn: if you want different constraints for different tracks, this approach (as an alternate syntax) is going to be difficult. The combination of constraints and this is not going to work well. <fjh> seems like we are combining discovery with call setup with permissions Burn: I like either Justin's approach or a synchronous approach that doesn't have this issue Martin: the issue is that the algorithm for which camera gets which constraints is not deterministic. ... this imposes the constraint you don't get the same camera in two different getUserMedia calls Burn: You are talking about setting a constraint with source idea <fjh> cullen's example had a user action both giving permission and accepting a call leading to clipping, the fundamental issue being receiver permission (as opposed to sender)? Burn: it's a failure of a mandatory constraint if you call it twice Martin: this is a short-hand for mandatory or optional Burn: But if it's optional, it will move on when the first one is busy ... the issue is that once you get access to the source you can send them to as many sinks as you like. <dom> "Subsequent calls to getUserMedia() (in this page or any other) should treat the resource that was previously allocated, as well as resources held by other applications, as busy. Resources marked as busy should not be provided as sources to the current web page, unless specified by the use" [17]http://dev.w3.org/2011/webrtc/editor/getusermedia.html#implementation-suggestions [17] http://dev.w3.org/2011/webrtc/editor/getusermedia.html#implementation-suggestions <dom> (in other words, it's not normative that two calls to getUserMedia result in two different cameras being attached) Fluffy: the problem started as clipping, and I'm pretty dubious about that; then it changed to disambiguation. Fluffy: there are various games to solve different problems, but we need to be clear on what the problem really is. ... in most cases that I can think of sending fake sdp won't really help. ... 
we need to be really crisp about the problem Harald: next time you send this to the mailing list: please be clear that you're requesting two video streams, because it wasn't clear to me One thing you haven't addressed is whether the stream that goes into getUserMedia is the same one that comes out? Martin: yes Harald: is it changed? Martin: the object attached to the tracks changes. Harald: if you think of tracks being replaced by getUserMedia, Martin: I don't think this proposal sees it that way: it's a pipe and now you're hooking it up to the mains. ekr: reads spec to mic Justin: reads different aspect of spec <dom> "A MediaStreamTrack object represents a media source in the user agent. Several MediaStreamTrack objects can represent the same media source, e.g., when the user chooses the same camera in the UI shown by two consecutive calls to getUserMedia()." [18]http://dev.w3.org/2011/webrtc/editor/getusermedia.html#mediastreamtrack [18] http://dev.w3.org/2011/webrtc/editor/getusermedia.html#mediastreamtrack ekr: at best, this is inconsistent Justin: the bits I read were normative. One use case: mashup application. If you set it up so that the user cannot get multiple copies of the source, then they won't be able to use them in the mashup. Martin: good point Burn: it's interesting that the spec says what you sent; when I discussed this with Anant, he added text to that section. I am not sure where to go from that, as it doesn't match. Stefan: Travis in his proposal changed a bit of that … you would get multiple access, but it would be a read-only device etc. Burn: that non-normative section, there was a concern that we did not want to mandate how user agents would represent multiple media. ... reads a new spec section ... I think the spec is at best inconsistent Cullen: I'm shocked! ... but is there anything we do have consensus on? ekr: I propose what we ought to be able to do, without syntax ... 
somehow acquire the same camera twice and alternatively get two different cameras. Does anyone disagree? Burn: yes: the question is whether you send the source multiple times or re-use, sending to multiple sinks. Then you can set different constraints on them. Martin: In the mashup, you can ask for the same source twice. ... it should be possible to do that. Harald: surprisingly, that topic is relevant to our next presentation. ... it's clear that we have no consensus, but we have two action items assigned. Tim is sending mail to the list on replace track. Martin is going to mail to the list with the syntax for multiple cameras. It's clear that people have more use cases in mind. Harald: so we better have more mail to the list, showing where it matters. Justin: I thought ekr was going down a good road, trying to see what people care about. ... if we can solve those problems, I will be happy. <fjh> +1 to clarity of use cases on list Cullen: I want to see use cases before we go down the path for "dummy track" pieces. Martin: this whole thing hinges on the other thing: what do you do when media arrives. I don't know what to do with that, but it is not what Cullen is proposing. Burn: we got into a bunch of things there. I want to make sure we preserve the simple case of having multiple different video sources with one audio. I want "your two video cameras". <timeless> scribe: Josh_Soref <timeless> scribenick: timeless Device reservation jesup: the presentation which only exists as of a few hours ago ... this touches on issues already discussed today ... and issues that came up in the past ... they will provoke discussion <burn> I said I wanted the simple (and original) use case of calling getUserMedia twice requesting video and getting the user's two video cameras jesup: i have opinions on some of these things ... but not all of them ... i'm hoping to make progress on a few of them ... the basic things ... who gets to access a device ... who gets to modify a device ... 
how does it affect others who are accessing a device ... how are they notified ... how do you set up a secure call [ Slide: Who gets to access a device? ] jesup: basic possibilities for access: ... exclusive ... once one thing has a camera, nothing else can get it burn: your list of items sounds similar to my Settings proposal jesup: i'm primarily talking about multiple applications/across tabs ... our original implementation was purely exclusive ... even within a tab ... and you had to use a fake stream ... it wasn't a big deal ... but for mashups ... it matters ... so, exclusive, either by default, or by locking it ... constraint{ mandatory } ... sharing it w/ another tab/app ... so i have it and Engadget has it as well ... the assumption is that the user has in some manner ok'd this ... sharing w/ tabs that are same origin ... mostly same time ... but in some cases, in the same origin ... shared w/ a friend app ... willing to share w/ X but not w/ anyone else ... so normally non same-origin apps could exchange a permission token to allow this other app to have access ... what do we surface to the user? ... in the current Firefox UI implementation ... the user is involved in every access/grant ... this isn't the same for Chrome ... there's an implicit decision that the user is wanting to share the device w/ multiple apps ... it's assumed they know they're sharing ... that's trickier w/ persistent permissions ... the user isn't informed directly ... the second point is speaking to that ... should there be an indication that a source is in use when they're asked ... do you need an explicit grant that it's shared? ... does an application need to know that a source it has got shared w/ someone else ... to warn the user, or to change how it's acting ... before i get to my thoughts on this ... there are UCs for these things ... I have an app that is doing Voice Commands ... it runs in the background in a tab ... and it's listening to the Mic ... 
and i make a phone call ... and i say "computer, please bring up my spreadsheet" ... and it acts on the command ... a real world application where sharing a stream w/ something in the background is really useful ... perhaps the user needs to give permission for that, on a onetime basis, or a permanent permission ... same for Video, if you do Video-gesture-recognition justin: are there UCs where you wouldn't want this sharing? jesup: for Secure Call ... even though you gave Voice Recognition permission ... if you're doing a Secure Call, you might not want to give it access ... my gut feeling is there are UCs for that TedHardie: access to the stream ... doesn't mean access to the media ... i have a screen sharing app ... and i have a tab which is getting access to pixels ... but the source of the media capture ... is the piece of the system that grabs my screen ... not the thing that grabs my camera <JonLennox> Speaker is Ted Hardie, not sure what nick he uses TedHardie: we could say that there's a permission for the camera ... but it doesn't necessarily cover screen sharing jesup: Screen sharing while useful, is fraught with... TedHardie: users can hang themselves w/ someone else's rope ekr: is this Read access or Write access? jesup: i was talking about gUM ... there's a separate slide for writes ekr: there's a reason to not allow other accessors to fight over pan/zoom jesup: we could, but we'd be sorry ekr: there are arguments for having an explicit lock down of a device ... it isn't that you couldn't have secure ... ... i think we'll need locking for write anyway ... for Secure Coms ... i'd be satisfied for the throb that blocks write to block read ... the issue about non-exclusive-read ... is about user privacy ... whether or not having access to a device allows you to find out about other sites with access to the device ... 1-XZ ... 2. if user is crazy to give access to a site they shouldn't trust ... 
then we can't protect them from being discovered as using a sex site ... 3-XW ... if you think you're allowed to set whitebalance/pan/tilt/zoom justin: i think we need examples to indicate there's a real threat jesup: there's certainly a privacy leak issue there ... for x and time y, you're in a hangout ... those are in theory possible ... that said ... i think i've come around to the opinion that it isn't a good argument for disabling the feature ... given implicit permission ekr: as long as there's a way to lock it jesup: for a secure call, as long as there's a way to mandate JonLennox: is there an exclusive access and there are other apps on the device ... is it reasonable that the browser be expected to make the OS api call for exclusive? <stefanh> josh: the UA should handle exclusivity, not the app <scribe> scribenick: martin__ Josh_Soref: I don't want to leak the user to be exposed into DoS based on this constraint ... maybe the user needs a way to specify exclusive access for a given app during the permission grant jesup: for example, a personal recording app might be used, even though you want to make a call, you might want to continue recording despite the permissions requested by the app ... even if the app requests mandatory constraints, the UA doesn't need to respect those Josh_Soref: I can live with that <timeless> scribenick: timeless jesup: right now ... if user gives permission to a second app ... we let them access it ... that gets more interesting with persistent permissions ... locking with exclusive ... pure exclusive seems overly-restrictive ... an alternative is to default to exclusive ... but permissive, if the app/user says so ... i suspect you get better utility out of default-permissive ... the apps interested in mandatory exclusive are a small subset ... but you could go the other way around ... as permission doubles as selection in Firefox ... same origin doesn't matter to us ... once we get persistent, we'll need a better solution ... 
lastly, if i have access to a device, and someone starts sharing it ... does my app know that it's being shared? ... to indicate to the other person that it's being shared w/ someone else ... i'm not certain about this ... the mechanism is simpler than what you do w/ it and why ... an event perhaps [ Who gets to modify a shared device? ] jesup: First user, everyone? ... [random] ... first asks to lock down access? ... there are probably arguments for only one app being able to modify a device ... gets tricky ... my example of the background voice commands ... has very little interest in modifying ... but it's likely the first opener ... so it'd need to be able to specify disinterest in modifying the device ... or a way for multiple to be able to modify ... can you lock a param, e.g. 30fps ... a subset of reader-writer issue ... is it per item or the whole stream ... i don't think that complexity is worthwhile stefanh: do you have a proposed api? jesup: i don't think it's worthwhile ... so no api ... i defer to travis's discussions ekr: there's got to be a way to ask for exclusive access to pan/tilt/zoom ... i'm aware someone asked for access, but ignore him jesup: at least a lock, or an exclusive writer ekr: anything manipulable on this device ... i have write, and anyone can share read <Travis> I think one interesting part of this discussion is being able to manage the exclusivity of a source. ekr: i could imagine living with a setting where there's a way ... for the other guy to steal the lock ... but i don't know how to have shared devices and have fights over writes jesup: that could be resolved culturally martin__: in the settings proposal ... you ask for a track w/ pan/tilt/zoom constraint ekr: i'm going to move them around martin__: you don't set a range, you set a value ... if someone else sets a constraint, they'll be told they can't change it ekr: but that's exclusive writer martin__: that's my understanding of the settings proposal ... 
you set pan/tilt/zoom for the source ... and it's incompatible w/ anyone else setting values jesup: for pan/tilt/zoom ... i can see plenty of cases where multiple people want to have access to control something ... in a shared control situation ... it isn't smart to be able to control at the same time martin__: exclusive for the first guy to set a property jim: can you get deadlocks? ... there's a track or device detail burn: that's why i asked about within an app or across apps/tabs ... solutions may be different ... settings has ways for within an app ... on what you do for a track which is shared ... we can talk about that tomorrow jesup: i'd like to focus on between apps TedHardie: i wonder about a way to give up exclusive access ... you set pan/tilt/zoom ... from a remote location ... but allow people to change it <Travis> +1 This resonates with me... jesup: multiple rooms dialed in ... you set a value ... but let the room change it <martin__> which resonates with you specifically travis? TedHardie: does that work for you? jim: you could have a way to decide to allow it? jesup: you get to an area of diminishing returns <Travis> Exclusive by default w/option to give up your rights... TedHardie: if you have the ability to give it up, seems enough <martin__> Yeah, that's sounding good to me. jim: do you get notified if it's changed if you give it up? TedHardie: no ... you notice the video looks different JonLennox: pan/tilt/zoom isn't how it actually works [ Security and Trust Model ] jesup: this maps to the whole sharing thing ... by default an app gets access to bits ... if you don't trust the app, it can send them anywhere ... not directly related to what we just talked about, but indirectly ... a local photobooth app ... could connect to a peerconnection and send it anywhere ... you play w/ a sample app ... and the person who wrote it could be watching you ... that's bad ... the first time it happens, it'll get in the press ... it'll be bad for all of us ... 
we need to think about it ... we haven't solved the problem martin__: i think we had ekr: tomorrow, ... i'll be talking about restricted access on a binary basis ... for the restricted media streams, that's covered jesup: preview for tomorrow's discussion ... but think about that for sharing media streams ... you may have to lock down streams in secure mode ... a lot of useful ... UCs may suddenly become hard ... you must trust those apps ... that they don't ship your data off ... the other option would be to constrain ... give access to bits, but not allow them to ship ... you could prevent them from wiring to a PeerConnection ... but they could put them into a Canvas and send them to same-origin Dan_Druta: the browser should know ... if there's a bit jesup: but it doesn't have to be an RTC session ... it could be WebSocket or XHR justin: is this a solvable problem? ... we spent time this morning talking about "easy" problems, ... and didn't make progress adambe: for a booth ... if i do funny stuff with a secure file jesup: there are existing cases of people hacking into computers and getting mics/cameras ... and using them for blackmail ... justin's right, if this isn't a solvable problem ... we should just put up warning flags justin: if you give access to the camera ... is there an expectation that the site won't upload the data to a site? jesup: right now, there's no way to block this ... there are technical ways we could do so ... via tainting and origin protection ... you could let them manipulate w/o bit access martin__: if you give bit access, you've given bit access ... if you've tainted the stream, you aren't giving bit access jesup: in that case you've given stream access but not bit access ... you could hook it up to <Video> but not <Canvas> ... this also applies to image security in browsers... mostly justin: what can you do that's interesting if you can't access the bits? gmandyam: on raw bits ... 
i thought this was encoded data jesup: gUM is raw-bits gmandyam: hta corrected me, there's no such thing as raw-bytes ... it's feasible to eventually get raw data ... our devices have face detection and such ... but the argument for taking it out was you could do it in JS jesup: there are other ways to ... if we revived media stream processing [ Time Check ] jesup: such that you could get access to the camera, manipulate it for face recognition ... but not give access to the bits w/in the app ... it's tricky, i agree, it's doable ... i'm not sure it'll get done ekr: we'll talk about this tomorrow ... interesting reasons for Calling/Preview apps to render or transmit ... anything more complicated will be very complicated [ Security (cont) ] jesup: leave most of this for ekr's discussion tomorrow ... enjoy lunch [ Lunch - 1 hour ] hta: we've identified needs for some degree of locking <Travis> See ya all tomorrow! Summary of Action Items [NEW] ACTION: hta to come up with a concrete proposal on what to do with error classes, based on the discussion on DOMErrors/DOMEvents. [recorded in [19]http://www.w3.org/2013/02/05-mediacap-minutes.html#action01] [NEW] ACTION: item to Tim Terriberry: investigate ReplaceTrack as a solution to the problem. [recorded in [20]http://www.w3.org/2013/02/05-mediacap-minutes.html#action02] [NEW] ACTION: Tim Terriberry to investigate replace track [recorded in [21]http://www.w3.org/2013/02/05-mediacap-minutes.html#action03] [End of minutes] __________________________________________________________ [1]W3C [1] http://www.w3.org/ Media Capture Task Force F2F Meeting 06 Feb 2013 See also: [2]IRC log [2] http://www.w3.org/2013/02/06-mediacap-irc Attendees Present Travis_Leithead Regrets Chair hta, stefanh Scribe Josh_Soref Contents * [3]Topics 1. [4]Recording Consent 2. 
[5]Settings and Constraints * [6]Summary of Action Items __________________________________________________________ <trackbot> Date: 06 February 2013 <dom> ScribeNick: gmandyam hta: going over conclusions from Feb. 5 ... Device enumeration has been concluded to be necessary, but there are privacy issues. Martin: privacy issues were not in scope of current proposal, but for suggested expanded capabilities. Additional device info could potentially improve privacy. hta: Error handling will not change from current model. Cullen: we should be consistent as to how we do error handling hta: "Immediate Stream", i.e. synchronous gUM. This belongs in a sense to PeerConnection. <dom> [7]Media Streams and Identity slides [7] http://www.w3.org/wiki/images/9/90/Gum-identity-interim-2013-01.pdf [ General Idea: Grant Media Access to an Identity ] [ Topics ] [ Stream Isolation ] <timeless> Scribe: Josh_Soref <timeless> scribenick: timeless [ History ] ekr: we got three proposals ... I presented a 1st draft in Lyon ... i never got around to writing it up ... this is a detailed description of my proposal [ Three ways to call gUM (all constraints) ] ekr: * Call gUM() ... * Call gUM(noaccess=true) ... I won't touch the stream ... * Call gUM(peerIdentity=bob@example.com) ... This stream will only be sent to a certain identity [ Call gUM() as normal ] ekr: current behavior unchanged <gmandyam> Hey Josh, are you scribing? I'll quit if you are. - Giri [ noaccess=true ] ekr: Media permissions checks as usual ... if site gets access, it gets permissions <gmandyam> OK - let me know when you need a break ekr: stream is isolated ... keying must be via DTLS ... you [martin__] will say no to identity assertion martin__: this will enable creation of identity assertion based on stream ekr: once peer connection is established, ... it either knows peer identity ... browser displays identity and "no access" indicator ... WebEx etc want to guarantee to others ... and customers want to check ... 
that the site didn't screw them over [ Flow slide ] ekr: indicator has an identity assertion [ peerIdentity=bob@example.com ] ekr: originally suggested by cullen? ... more enhanced from user perspective ... people seem to be concerned about allowing a site to have access in perpetuity ... "i only want to allow site to call X and Y and no one else" ... site indicates to user who the content will be sent to ... either RFC822 style email address ... or Browser if it has Contacts access, it could show name + icon TedHardie: for Poker site example ... would it be skiptracer@pokersite.example ? ... but could you do player1@ ? ... or would you need a different mechanism ekr: i think you'd outsource identity assertion martin__: to do a Player 1 ... you need player1@pokerstars.example ... the name is in this form ... and the domain for this domain can do the identity assertion ekr: you need to be able to mechanically identify this person TedHardie: this isn't valid for the poker table case without pokerstars preminting martin__: just minting Dan_Druta: there's a concern about information harvesting gmandyam: what if you want to do a native media player ... if i want to use the recording api martin__: you'd use the gUM(noaccess=false) ... you don't need to set the argument gmandyam: what would this mean for access? ekr: access will be on `origin` basis gmandyam: for access to raw media? ekr: you'll still have that ... this is a new way to restrict access gmandyam: what motivates a web site to do this? cullen: sites like WebEx want to be able to assure customers that they *don't* have access to the media ... they'll choose to use this mechanism ... and users could verify this by something morally equivalent to the lock icon today justin: what does it mean for WebEx to not have access to the media? ... surely they'll have the data cullen: WebEx won't have the keying material ... for a Point to Point Call ... it knows martin__ called ekr ... 
and for a TURN server, it might all flow through ... but it doesn't have the keying material ... for a 10 person conference ... you might know who is sending the content v. spreading ... but you won't know what was sent ... for WebEx or Google Hangouts ... if Microsoft and Silverlake ... are doing some deal about how Cisco would go out of business ... they want assurance that we aren't listening justin: but in a WebEx conference ... the central node won't have any access to the media? ... what assertion covers the end to end ... that it won't have access to the media? cullen: for 1-1 i'm saying we can do it aroach: if identity asserter and site you're on ... are the same or are in collusion, you're ... ekr: you're hosed ... this is about a site being able to provide you with evidence ... but you have to trust the source of the communicating counterpart PeterThatcher: we're pushing trust from site hosting JS ... to site hosting identification ekr: or a site with extra ... PeterThatcher: are we confusing the user ... about do you trust X v. Y.com ? ekr: always some risk ... people have some sense that domains control identities ... some sense, ... companies reassign email addresses ... Gmail can redirect email somewhere else ... 2. there's a lot of work in distributed identification systems ... BrowserID ... it's a lot more straightforward to do a calling site ... than an identity provider ... so your identity is likely to be by someone you trust ... but the caller site something less trusted ... 3. this is what we know how to do ... MediaStream can only be connected to a PeerConnection with identity provided ... completion would choke if call goes elsewhere [ ekr explains call flow ] ekr: that's the entirety of the mechanism ... i can send to the list on the details ... but it's simple gmandyam: can you go back to the call flow slide? ... this is entirely voluntary ... the web site can contact the ID provider directly ... 
the web site can put up its own dialog box ... it can ask user "can i send to bob@example.com" ... why add to gUM ekr: this is for Identity where you don't trust calling site ... this is about denying calling site access to media gmandyam: the web site can get access to the media by not doing this ekr: this is a way to build a site where the user can verify the site isn't sniffing cullen: there's a reason the browser needs to show the Lock icon ... instead of the web site showing "secured by Thawte" ekr: they do that too though ;-) cullen: this is about the chrome of the browser showing the assertion ami: how does bob tell his PeerConnection that he's bob@example.com ekr: that's defined in IETF drafts jesup: in addition to who puts up the assertion ... and whether they can access the media ... even if you trust the site to not access the media ... the site can always be hacked ... so trusting them ... if they're hacked ... worst case they won't get the media ekr: i'll send detailed text for the proposal Recording Consent ekr: W3C (will have?) published recording API as FPWD ... do we need special recording consent? [ Basic Recording Concept (review) ] ekr: i think i'm summarizing correctly ... MediaStream ... feed it to MediaRecording() constructor ... js gets direct access to media [ Why the special permissions? ] ekr: quoting Ben Pedtick ... there are regulatory requirements for such notifications [ Not the only way to get direct access ] ekr: people suggested building it into the browser interface ... Pipe MediaStream to PeerConnection ... record on server ... loop back to client ... video can be read to Canvas ... no technical measure can distinguish recording from a webrtc call to an arbitrary location ... no matter how many dialogs we pop up ... there's a way to build something that avoids the dialogs [ Proposed Resolution ] ekr: what you want to do is often to have the recording on the remote side ... I'd propose not having extra permissions ... 
and say it's the application's job to deal w/ regulatory requirements ... Chrome and Firefox already render indicators that the media is in use ... we could add a UI for a tape drive ... but i think that should be done in content Josh_Soref: +1 martin__: +1 ... concern that we document this decision in Security and Privacy PeterThatcher: when you're telling the user that media will be sent ... would it be good for the browser to have fine detail that the media could be recorded <dom> ACTION: ekr to draft text on no-permission-required for recording for inclusion in the mediastream-recording doc [recorded in [8]http://www.w3.org/2013/02/06-mediacap-minutes.html#action01] <trackbot> Error finding 'ekr'. You can review and register nicknames at <[9]http://www.w3.org/2011/04/webrtc/mediacap/track/users>. [9] http://www.w3.org/2011/04/webrtc/mediacap/track/users <dom> ACTION: eric to draft text on no-permission-required for recording for inclusion in the mediastream-recording doc [recorded in [10]http://www.w3.org/2013/02/06-mediacap-minutes.html#action02] <trackbot> Created ACTION-16 - Draft text on no-permission-required for recording for inclusion in the mediastream-recording doc [on Eric Rescorla - due 2013-02-13]. PeterThatcher: as part of the Chrome <gmandyam> Clarification: Permission check only on gUM, not for recording is current proposal. This is independent of ekr's proposal about a peerIdentity argument for gUM. ekr: "maybe this is being recorded and sent to your mom" PeterThatcher: if there isn't identity stuff ... "maybe it could go anywhere" cullen: when you turn on your computer, you may be recorded ... and when the green light goes on, you *are* being recorded ... i'm fine with "documenting that recording happens" Settings and Constraints <martin__> and the only reason that I made the suggestion is so that we don't have to have these sorts of discussions ever again [ Dynamic usable constraints ] burn: for people who want to use media locally ... 
the constraint mechanism didn't do well for what you had to do in the browser ... Travis spent a lot of time thinking about how this would work ... the original constraint proposal was for selecting a stream ... but there was no ability to change constraints over time <dom> [11]Constraints and settings [11] http://www.w3.org/wiki/images/f/fd/Constraints20130406.pdf burn: what if you want to change the settings of the camera ... Travis came up with the proposal ... he did most of the work, and gets credit for what's good ... i get credit for what's confusing [ Constraints mini-review ] burn: a brief overview ... when requesting media, you can give a set of constraints ... mandatory and optional ... both sections are optional ... mandatory section says ... "if you can't give me a track that satisfies these constraints, then i want an error" ... the optional section is ordered ... these are constraints i'd like to be satisfied, but if you can't, that's ok ... maybe you ask for Width, and aspect ratio ... but maybe Width is more important ekr: we still have audio:true/video:true? burn: yes ... i'm not even showing this is for video or for audio ... you have separate constraints for audio and video ... there were differing versions of for audio or video cullen: can i have a width-min in one row ... and below that a width-max below burn: yes ... i pulled this from an example ... you can have constraints multiple times (e.g. width-min) ... earlier ones are higher priority [ Based on Travis's Settings v6 proposal ] burn: there was a little email traffic, not a lot ... when Travis and I discussed it last week, a few things required tweaking ... some things are slightly different from that proposal [ The Problem ] burn: it's called the Settings proposal ... even though v6 removed "Settings" from the proposal ... setting width to a number ... say 650 ... is by setting the value, or using min+max to the same value ... here is why we have this proposal ... 
when you go to get video from the camera ... you could display it in multiple <video> elements ... send it over PeerConnection ... multiple sinks ... multiple Tracks with the same underlying source ... you could have different requirements for the underlying source ... what you want to send over PeerConnection could be a different framerate than what you show ... when you send to a Sink, that could have different characteristics than ... your source ... and the browser has to come up with something to do ... capabilities of source may change dynamically ... something the browser does ... or the user does ... the camera could go away ... or be physically muting it ... you could sometimes want a thumbnail ... or sometimes want a full video ... Sink could change on the fly [ Home Client ] burn: Video Camera generating stream of 1920x1200 ... two tracks with the same source ... neither constrained ... each track is attached to its own <video> ... one <video width=1920 height=1200>, one <video width=320 height=200> ... the browser will upscale/downscale ekr: in most examples today ... people don't generally create Tracks ... is the idea we do gUM() once? burn: how you get different video sources ... vs. the same ... there's some discussion ... the next slide may ... it may not matter that it's two tracks or one [ Next slide ] burn: say the 1920x1200 video is changed ... to <video width=1024 height=768> ... the browser will likely scale, as it was for the other sink ... this change doesn't require the source to change jim: ekr's proposal ... if you set the stream element as isolated ... would the js be able to do anything with it? burn: you can have a source that's remote, or from file martin__: tracks that you get back can be manipulated in any number of ways ... by setting constraints ... you just can't manipulate the pixel/audio data burn: this proposal is independent of that permission discussion gmandyam: the examples here are downscaling ... if you do upscaling ... 
and there was a mandatory constraint ... can the UA do postprocessing to enhance the video burn: upscaling and constraints-mandatory gmandyam: you have a source ... and the browser wants to avoid making it grainy ... so it postprocesses which increases delay burn: in this example, there are no constraints set on the track ... that's important for this example ... i'm happy w/ whatever i get from the camera ... and whatever it needs to do ... if it needs to do upscale/processing ... i'm fine with that in this example ... regardless of what the source provides ... and track constraints ... the browser will try to do the most intelligent thing it can ... if the track goes away ... that's an error ... otherwise, the browser will do something reasonable ... if the application doesn't like what it did ... then it can set constraints ... to control things available to the sink cullen: we had a long discussion a year or two ago ... <video> should be the only thing to do upscaling ... Track won't ... Track could Downscale ... if you duplicate for something that will be smaller ... the Track could downscale ... and as a general rule of thumb ... things should choose largest possible within constraints burn: cullen is describing guidelines ... not to go into the spec as normative text cullen: understanding that this is what will happen ... makes it possible to understand that this proposal would work burn: understand that browsers want to do these things ... this is an interface ... for the application to talk to the browser ... but still allow operation [ Next slide ] burn: although the decrease in the largest <video> tag didn't mandate a change on the video source ... it may result in a change by the browser ... if the browser wants to inform the source ... that it'd be better to get a 1024x768 ... that has an effect on Tracks ... but not Constraints on Tracks martin__: there's an Aspect Ratio change ... in the original state ... the 320x200 probably had letterbox ... 
as a result of changing the display in the big <video> ... you potentially changed the source ... which removes the letterbox in the small <video> ... it isn't intuitive, but it makes a lot of sense burn: but if you don't like that behavior, you could create constraints for it [ The Proposal Summary ] burn: separate notions of "source", "track", "sink" ... Media Capture Streams document doesn't give direct control over source or sink ... only tracks ... source configuration can change at any time ... browser will adjust what's sent to Track ... or have an event [ The Proposal (Expanded) ] burn: what's a source? ... Camera, microphone, RTCPeerConnection, file, Screen Capture ... no direct access or control by app ... browser configures ... what's a Track? ... it's a constraint carrier ... and it holds statistics ... logically, it represents an incarnation of the media ... you could create 2 tracks with different constraints ... if you attach them to a PeerConnection ... there could be different resolutions of media sent on the two ... What are sinks? ... <video>, PeerConnection, Recording ... No direct access or control via Media Capture/WebRTC APIs cullen: can Track connect to Track? burn: no ... i'd have to think about what it means martin__: not on slide ... is MediaProcessing API ... that takes Source Sink and produces a Track? hta: in the original proposal ... we could connect Tracks to Tracks ... after scratching our heads for months ... we removed it Dan_Druta: we haven't discussed <source> being a <video> element ... i think recording should allow saving a streamed item burn: this isn't an exhaustive list Dan_Druta: i like the concept of source, track, sink jim: +1 burn: room full of programmers, no explicit "..." then it isn't there [ laughter ] ekr: i have a camera w/ 4x3 ... but i have a mandatory constraint for 16:9 <fjh> +1 to concept of source, track, sink ekr: is it acceptable to pillarbox it? burn: Tracks have constraints ... 
if you request a 16:9 mandatory constraint ... you'll get a failure right away ... that media can't give you that martin__: if you want pillarbox ... you'd create it w/o that constraint burn: it's up to the browser ... if you say you want to receive a video that's 16:9 ... a browser may try to be intelligent and do that mapping for you ekr: if we start moving constraints away ... from sources ... i start to wonder how much i'm being allowed ... do i lose the ability to burn: having a constraint for `native resolution` might make sense? custin: i have a 4:3 in ... and i want to munge to 16:9 ... by cropping top+bottom ... i ask camera to give me 16:9, i'd want it to do that for me burn: we'll have to define that in the aspect-ratio constraint hta: we're descending into details ... i'd like to get to the rest of the slides justin: looking forward to future discussion burn: i believe it can accommodate ... there's a lot in the browser that's impl dependent jesup: i think it's reasonable when you apply a constraint to a Track ... the constraint could be satisfied by the source ... how much could be impl dependent cullen: on constraints that imply change to the media ... the standard should be such that browsers are consistent ... if setting a constraint causes media to get twisted around a Torus <martin__> the thought occurs that mandatory and optional aren't the only categories in this "constraints" taxonomy: the mandatory category might be split into "fail if you can't do this" and "make it this way, no matter what it takes" cullen: it has to happen the same way in all browsers ekr: it isn't that i care about aspect ratio <JonLennox> martin__ - what's the difference? ekr: i'm concerned that if one way to satisfy is DSP processing ... then how constraints actually operate ... for device selection ... you could satisfy and constraint-A is more important than constraint-B ... 
but you choose to do DSP on device-1 burn: personal opinion is browser shouldn't do munging from Source to Track ... Track has certain ... it's the Sink that does the mapping Paul: a little fuzzy ... the browser can control the source ... to make it better suited to the constraints of the track ... what about pan/zoom? burn: we talked about pan/zoom yesterday ... and i realized it works differently martin__: pan/zoom might be different ... like the constraints ... must not request processing of source ... if we make it clear that "must not do processing" burn: i agree ... that's been the mental model i've had the whole time derf: +1 to martin__ ... any constraint that requires DSP'ing the source ... we shouldn't do ... on some platforms it'll be good, and on some, it'll fail burn: hearing +1s ... anyone who does not want Sources to give Tracks directly what is asked for in constraints w/o DSPing? justin: can we avoid the double negative? ... is cropping DSP? cullen: what about Echo-Cancelation? burn: if source provides something a certain way ... how the source provides it is acceptable ... it's the browser doing it that's a problem justin: if camera does VGA and i want 16:9, there needs to be a way to get it martin__: key is browser not doing stuff ... if you want 16:9 from 4:3 <Travis> Let's not forget that there are track constraints and sink constraints (video tag) martin__: you can push it in a 16:9 <video> cullen: if i gUM() and i directly connect to <video> it shouldn't need me to pull out and reprocess hta: we're discussing unobservable control surfaces ... there's no way to discover the difference between to track / from track ekr: it is burn: it is with PeerConnection ekr: what to do w/ 4:3 input and 16:9 display cullen: i want to send something specific and your camera is different ... i need it to just work ... and cropping or letterboxing needs to be fairly consistent hta: get through the presentation? 
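[ Editor's sketch, not part of the meeting record ] The 4:3-to-16:9 case that ekr, justin and cullen raise above comes down to a small piece of arithmetic. The TypeScript below is only an illustration under stated assumptions: the `Size` and `CropRect` types and the `centerCrop` function are hypothetical names that appear in no Media Capture draft, and the crop is assumed to be centered, discarding equal margins top/bottom or left/right.

```typescript
// Hypothetical helper types for illustration; not part of any Media Capture API.
interface Size { width: number; height: number; }
interface CropRect extends Size { x: number; y: number; }

// Largest centered rectangle inside `source` with aspect ratio aspectW:aspectH.
// Cross-multiplication keeps the comparison in integers.
function centerCrop(source: Size, aspectW: number, aspectH: number): CropRect {
  if (source.width * aspectH > source.height * aspectW) {
    // Source is wider than the target shape: keep full height, trim left/right.
    const width = Math.floor((source.height * aspectW) / aspectH);
    return { x: Math.floor((source.width - width) / 2), y: 0, width, height: source.height };
  }
  // Source is as narrow as or narrower than the target: keep full width, trim top/bottom.
  const height = Math.floor((source.width * aspectH) / aspectW);
  return { x: 0, y: Math.floor((source.height - height) / 2), width: source.width, height };
}

// A VGA (4:3) camera asked for 16:9 loses 60 rows at the top and 60 at the bottom.
const vga: Size = { width: 640, height: 480 };
const cropped = centerCrop(vga, 16, 9);
console.log(cropped); // { x: 0, y: 60, width: 640, height: 360 }
```

Whether a browser, a source or a sink should perform such a crop is exactly the open question in the discussion; the sketch only shows what "cropping top+bottom" would yield.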
ekr: example case of general principle jesup: interested to hear what ekr says ... consider native resolution of camera ... may be willing to give any res ... but many will DSP to give any res ekr: constraints are for prioritizing which device to select justin: sourceid selects a specific device ekr: i have 2 cameras ... one suitable for aspect ratio i want ... one much less suitable ... first constraint is aspect ratio ... browser decides to rescale video ... choice alg is selected by ordered choice <dom> Josh_Soref: I'm used to getUserMedia with a pop up allowing the user to select a device that matches the constraints set by the application <dom> burn: when multiple sources can satisfy all the mandatory constraints, it goes through the optional ones to further filter <dom> ... if after that there are still more than one, it's up to the browser to select <dom> ... where the user can be involved <dom> scribenick: dom burn: let's finish the slides to look at the peerconnection case <Travis> There's a dichotomy of desired full-and-absolute control by the app, and variability wiggle-room by the UA. As long as UA can have wiggle-room, then there will be cases where two implementations will be different. juberti: real world cases of multiple devices is front camera has lower resolution than back camera ... but in many cases, front camera is what you will want ... Are there many cases where resolution matters more than what the camera is looking at? burn: there are many cases where your app has specific resolution requirements ... if your app cares more about the orientation of the device, then that's what your constraints should impose ... the app has to pick its constraints in accordance with its needs ... there may be different behavior for selecting via constraints, and controlling via settings <Travis> I'm concerned about splitting those scenarios (selection vs. 
settings changes) burn: settings is "make it so no matter what", whereas constraints is "only give it to me if it is so" randell: I'm concerned we're doing a lot of procedures, algorithms, failure cases, for relatively small slice of applications that require very specific device characteristics ... for apps that don't want to let the user pick what the best camera/mike would be burn: I would argue that applications care a lot about that ... e.g. to match what native apps can do <timeless> scribenick: timeless dom: i'd prefer we don't say many apps that do this ... but specific examples ... computer vision is a small UC burn: how many people have developing Web Apps as their primary job for them or their companies ... I see 2 hands ... be very careful about optimizing for impl ... when we don't have primary users ... my company is more interested in enabling users to build specific apps ... i don't do that ... but the people i work w/ do adambe: constraints when you select a device v. setting on Track ... discussion about are constraints remembered ... can i read out constraints from the track? burn: i'll get to that adambe: when someone sends me a track over PeerConnection ... the PeerConnection is, from my PoV, the source <Travis> We need to finish the slides... adambe: it's configured some way ... can i read out those constraints from my Track? burn: let me finish the slides [ Home Client; Away Client ] Local source 800x600 video camera Three local sinks: <video w:1920 h:1200> <video w:320 h:200> Remote sinks: <video w:150 h:100> <video w:1024 h:768> burn: there may be constraints set here martin__: what makes the 1024x768 in the peer connection ... is that the packets from the video over the wire hta: i'd like to let burn present 3 more slides burn: i didn't say if the tracks were constrained or not [ Next slide ] burn: video source changes resolution ... 1920x1200 ... 
track content may change, sinks still the same [ The Proposal (Expanded) ] burn: constraints on tracks, not on sources ... browser does its best to adjust sources to satisfy constraints ... roughly intersecting constraints ... it's the browser's job to satisfy combined constraints set ... constraints can be modified on the Track ... replacing prior constraints ... Tracks can become overconstrained ... 1. when you initially request ... 2. later in process, video source isn't available, or it changes ... in that case, there will be an OverConstrained Event [ Next Slide ] [ Home Client ] video camera w:1024 h:768 fillLightMode: on [at creation ] <video w:800 h:600> N <video w:1024 h:768> P burn: the other track sets a mandatory fillLightMode:off ... it will get an overconstrained event (to the track) ... and the track is muted ekr: how do i recover burn: you set a handler on the overconstrained track ... and you set different constraints ekr: do i know which is the problematic constraint? burn: the intent with the current spec is to give you information cullen: if N and P set them both to On ... and then N sets it to off, it will fail? burn: yes cullen: last guy who tries to make an inconsistent change to the resource loses martin__: that doesn't preclude locking discussion from yesterday burn: that's why i asked about between tabs martin__: this would work ok w/in an app ... and ok between tabs in non-exclusive burn: "no required side-effect on source" ... P has a res of 1024x768 ... N has res of 800x600 ... P was getting nothing ... but camera was still generating 1024x768 [ Next slide ] burn: there's no requirement that the browser do anything ... but the browser may change camera setting to match remaining sink ... up to browser policy ... permitted but not required martin__: you were concentrating on optimization ... i thought it would end burn: muted Josh_Soref: it matters because it can resume burn: v6 proposal said overconstrained ... 
caused all tracks to go into an ARM (?) state ... here, the right thing is that the track causing overconstrained goes to muted martin__: softer, you try to set ... and instead it fails w/o muting <Travis> Constraint application cannot fail. <Travis> It is an overlay set of requirements burn: i like that adambe: i like martin__ 's proposal ... that when you set something that doesn't work <ekr> I also like martin's proposal adambe: it discards the latest setting and continues <ekr> also martin_______'s proposal adambe: when something happens and it isn't something that you did ... when the source changes ... in some cases, it might be nice to pop a constraint burn: say the camera is unplugged ... all your tracks are severely constrained stefanh: aren't they ended? martin__: ignoring shit happens burn: if there's a switch on the camera that's on/off Josh_Soref: it might be a "mute" stefanh: i think we have an onmute event martin__: you may not have software control over mutebutton burn: if something happens in the world ... we need to have a way for things to go to muted ... instead of just going away totally martin__: i don't see any reason not to allow to continue to receive the stream burn: user will see something martin__: you listen for event, ... and if you care, fiddle burn: don't change track [ Coffee Break 15 minutes ] [ The Proposal (Expanded) ] burn: we have Capabilities, and the current state ... capabilities and states are properties of a source ... - not a track ekr: same namespace for keys? burn: yes ... one thing that might change is that Selection v. Control ... for Control case, you might want current state to be on Track and not Source ... your Track might be requiring DSP munging ... we use the term "released" ... "once a source has been `released` to the application (via permissions, pre-configured allowed, or other) the application will be able to discover additional source-specific capabilities." 
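[ Editor's sketch, not part of the meeting record ] The "softer" behaviour martin__ suggested in the overconstrained discussion (an inconsistent constraint change fails and is discarded, without muting the track) can be shown with a toy model of the fillLightMode example. Everything here is an assumption made for illustration: `ToySource`, `ToyTrack`, `apply` and `ApplyResult` are hypothetical names, only exact-value mandatory constraints are modeled, and nothing below is the API of the v6 proposal or of any draft.

```typescript
// Toy model only: these classes are hypothetical and mirror no real browser API.
type Constraints = Record<string, string>;

interface ApplyResult { ok: boolean; conflicts: string[]; }

class ToyTrack {
  mandatory: Constraints = {};
  constructor(readonly label: string) {}
}

class ToySource {
  private tracks: ToyTrack[] = [];

  addTrack(label: string): ToyTrack {
    const track = new ToyTrack(label);
    this.tracks.push(track);
    return track;
  }

  // Replace a track's mandatory constraints, unless a requested value
  // contradicts what another track on the same source already requires.
  apply(track: ToyTrack, requested: Constraints): ApplyResult {
    const conflicts: string[] = [];
    for (const key of Object.keys(requested)) {
      const clash = this.tracks.some(
        (other) => other !== track && key in other.mandatory && other.mandatory[key] !== requested[key]
      );
      if (clash) conflicts.push(key);
    }
    if (conflicts.length > 0) {
      return { ok: false, conflicts }; // the track keeps its previous constraints
    }
    track.mandatory = { ...requested }; // new constraints replace the prior set
    return { ok: true, conflicts: [] };
  }
}

// cullen's case: N and P both require the fill light on, then N asks for off.
const camera = new ToySource();
const n = camera.addTrack("N");
const p = camera.addTrack("P");
const first = camera.apply(n, { fillLightMode: "on" });
const second = camera.apply(p, { fillLightMode: "on" });
const third = camera.apply(n, { fillLightMode: "off" }); // rejected: clashes with P
console.log(first.ok, second.ok, third.ok, third.conflicts, n.mandatory.fillLightMode);
```

In this model the last requester of an inconsistent change loses, as cullen put it, and the returned `conflicts` list stands in for the information burn says the spec intends to give about the problematic constraint.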
[ The Other Details ] burn: the constraint registry talks about min-max and enums ... this proposal also includes primary types (strings, ints, float, boolean) ... i'm tempted w/ adjusting the registry ... there are also source ids ... i got a source id the last time i was here and saved it ... you may also be able to get "video", "audio" and the complete set of source ids ... proposal has separate types for audio and video ... and explicit creation of Tracks (not currently possible in gUM) ... this proposal adds an explicit constructor for Audio/Video tracks ... which you then call into gUM [ Proposed (mod of Settings v6) ] burn: it talks about tracks being audio, video, "none" ... that a track can be "readonly" or "remote" ... you can have "readonly" "video", or "remote" "video" ... readonly/remote are other states ... we're not sure exactly how it would be used martin__: not a particularly good answer ... i think you need to go into more about readonly ... if you insert it, you have to justify it burn: Travis inserted it martin__: i can't write video to my camera burn: readonly is intended to mean you can't change settings on a source Josh_Soref: one case is that it's a file stream ... but that discloses it's a file stream martin__: i'm not sure we need it ... if we don't kill the track for applying settings ... then i don't know we need it burn: ok [ About the proposal ] burn: there were a small number of comments to the ML ... it's a long very detailed proposal ... probably very few people read it ... i wanted to talk about it ... stefanh said that the editors would start applying ... but cullen shot it down martin__: since cullen isn't here, i think it's very good stefanh: cullen made that comment after reading the wrong version martin__: there's a huge body of stuff that's very useful ... we can fix up the little bits later burn: i agree, although we need to work out selection v. control ... it's the only thing major enough to work out ekr: 1. 
is this the right API ... 2. how do we feel about the specific field values ... suggestion is we take these in two stages ... 1. do we think it's the right arch? ... 2. do the list in pieces burn: i agree w/ you ... my plan for today ... was see if we can agree on Arch ... then start applying that ... if we agreed today ... we'd go on to the capabilities list ... not try to agree on things to go in, not even describe all of them ... these are many that have been proposed ... i wanted to poll for levels of interest on these ... to help me order for a start of a discussion [ About the proposal ] burn: open the floor to comments PeterThatcher: lot of complexity ... concern that browsers do the same thing <Travis> It will be very very hard to get exact browser parity when there is wiggle-room in terms of how to handle unconstrained tracks. PeterThatcher: hard to see the browsers doing both right burn: do the same cullen: words i have are "have the same UX" PeterThatcher: has there been consideration for a lower level api? ... less magic in the browser burn: let me speak about history ... discussed for a while ... 2 camps of people ... those who believed the browser should make decisions ... those who believed JS developer should make decisions ... constraints was an attempt to find a middle ground ... let JS developer specify at the level that matters ... and let browsers be wise beyond that ... it seems there's a significant camp who want more control after selection ... that led to the proposal of a settings api ... it's still a settings api for control after selection ... but it still uses constraints language ... so it still gives flexibility ... one other aspect here ... privacy/fingerprinting ... there was concern initially that giving the JS dev direct complete access ... ideally from an App dev ... they want complete list of devices and take absolute control ... we heard from a number of browser devs saying that isn't a good idea ... 
if we relaxed our notion of what we can give to devs ... or if we came up w/ a permissions set ... to give more control ... i think this is the best compromise of what you give the App dev ... to allow control ... and to let them to be notified when it can't be ... App says "if you don't give me this, i can't run" ... it's a long winded answer jim: Recording also has Capabilities and settings ... Recording doesn't have Constraints burn: what if you have multiple recorders? ... you could say you have constraints but none jesup: you talked about how we got here ... largely fingerprinting and control burn: and selection jesup: before user gives consent ... concerned about giving full details of devices ... propose an alternative solution to that conundrum ... allow JS to have access to device list and capabilities ... within a JS worker with a different origin ... it's constrained to return an output of a sourceid burn: i'll let ekr comment jesup: you rate limit how many times you call it ... to avoid bit yielding ... you remove a very complex hard to make sure it's going to work the same way ... i'm concerned about implementing this and testing it ... and making sure it works in configurations ... i'm tempted to give app writers ... and say to app writers "we have this interface, don't use it" ekr: jesup's proposal could work ... leak of 20-30 bits ... limit to 2 queries per second ... we're done in 8 seconds ... not sure it solves the privacy problem, not sure i care either cullen: i think we're coming up w/ a lot of stuff that works for settings ... dealing w/ tracks and control them once you have access ... the selection part is where we're stumbling more burn: it's been in the document longer cullen: you can see the widespread agreement </sarcasm> ... generally, it's about the user choosing the camera they want ... i wonder whether we should ... ... the privacy history ... i'm becoming ... no one has ever identified attacks ... 
once you could do Fonts on Canvas ... as we go down this path, i'm trying to understand why we're going down that hole burn: i hear you ... adding settings is more complex than selection ... i don't think we have new arguments against selection different from when we started ... it definitely should make the easy case easy ... that's still easy ... it handles intermediate case nicely ... you don't need device ids ... or a stored database of every camera on the planet ... i'm nervous about chucking it and starting over martin__: jesup's suggestion ... other than ekr's bits ... means you can't build the UX we're talking about ... w/ a source picker in content ... with the rate limiting of the worker ... on cullen's point, i want to think about it burn: the people arguing for settings control aren't here cullen: the model here for settings seems ok ... but selection seems less understandable ekr: why i like having selection ... 2 working examples in browsers ... Chrome... there's a browser global camera preference ... do you triage? justin: we pick the global camera pref ekr: firefox prompts you all the time ... 3 models for how selection should work ... #1 it's a browser setting ... #2 it's a user controlled setting ... #3 it's a JS-Browser negotiation ... #4 in content selection ... this quasi magic thing isn't what people want ... Control in Chrome, in Content, in neither? ... in content selectors have these privacy risks Josh_Soref: it seems like everyone's happy w/ merging Settings in ... but we might want to drop out the Selection bits hta: we have action items about Selection from yesterday ... but today we're talking about Changing Constraints aka "Settings" ... once the line empties ... (if ever) ... i'd like to call consensus to incorporate this version of settings manipulation into the document ... knowing that how we do selection is still dependent on Action Items from yesterday jesup: i'm sympathetic to that we've put a lot of time into this ... 
a lot of effort ... you don't really control that ... you can't apply that constraint on a mac <fluffy> OSX computers will give you whatever resolution you request regardless of capabilities of camera jesup: "i know from a DB this camera can't do that" ... my concern about selection constraint ... is people will overspecify, locking out the user ... even if it isn't the perfect camera ... i'd like to really understand UCs that drive those ... to override user's ability to choose burn: bad apps can always be written ... bad apps won't be used for very long Dan_Druta: i'd like more clarity ... no guarantee about control burn: all tracks connected to source receive an overconstrained event ... we had a discussion about different options ... i liked proposal of not killing-muting track and just sending event Dan_Druta: for some it works, for some not ... but js app doesn't have access to this <Travis> In the v6 proposal, there is actually no "state change" event. So, if no constraints are applied, there would be no notification of source state change (short of polling the state values). <Travis> (This is probably a missing piece in the v6 proposal). Josh_Soref: the path is disappearing from file content widget ... partially UX and partially security PeterThatcher: we have consensus on Settings ... does that include cropping? hta: we've gotten enough info that cropping -- how to adjust video to fit into boxes ... requires more discussion <dom> dom: for in-content device selection, couldn't browsers provide a tainted in-content widget that JavaScript couldn't access? similar to the file selector widget ekr: stop improving Chrome hta: that's life ... Queue is empty <ekr> it's too awesome already hta: show of hands ekr: voting by individual or company ... if i were recording, it would require me to record by company ... if knowing we have an ongoing discussion about: ... A. Selection ... B. Cropping ... C. we haven't discussed specific constraints ... 
do we have consensus to incorporate the framework of Constraint-Manipulation-After-Selection into the specification? <Travis> Travis: Yes, we should incorporate! ekr: if you believe we should incorporate this, at this time ... raise your hand ... consequences of yes/no? hta: if you say no, we have to discuss this more later ... and you have to propose changes ... and if not, your proposal for "deleted" ekr: if consensus is yes, a new ED will have this incorporated gmandyam: is this v6 compatible? hta: it will be self-consistent, and similar to v6 ... there were a number of minor tweaks between v6 and now ... v6 was impossible to implement cullen: burn presented a number of substantial differences to v6 burn: correct ... my goal as an editor in putting this in ... for areas where we don't have consensus ... i'll include a big note that we're still discussing the section ... if i miss something like that, please call me out on it ekr: just wanted to understand that dom: question about integrating it? ... or is it about consensus? Josh_Soref: which work-mode do we have? ... propose, reach consensus, commit ... or put in and then get consensus burn: minor things may not have a note ... major things will have a note until consensus justin: for things that need more discussion hta: let's do that after this ... if we try to do too many things at once, we won't do any well ... If we think the editing team should incorporate the proposal ... knowing open items [ 16 hands + Travis ] hta: if you oppose, raise your hand now [ 1 hand = justin ] hta: editing team will incorporate as stated ... chairs take an action item to ensure open items are discussed on list cullen: justin, what are your concerns? martin__: i'd like to get justin's opinion ... i'd like editors to mark open issues in the draft burn: that's the intent hta: yes justin: all things people want to do ... are all things because they're trying to do these things ... we have all these things ... 
but we might have many more as we try to do the rest hta: next step is incorporating burn: if it weren't for privacy, we'd be done a long time ago ekr: this spec is far in advance of anything impls have done ... we're going to be finding things we don't understand for quite some time ... because it's so far ahead of what we've done justin: getting stuff working is focus of work ... that list of constraints ... i'd like to make a white list of things we're going to care about burn: that's the next slide [ Candidate constraints/states/capabilities ] burn: as an editor, we haven't had specific ones to talk about ... we talk about specifics, it becomes general, it goes back and forth ... i'd like to with the group of people in the room ... not the same as list ... what i'm going to ask ... go through each one, and with a rough idea in your head of what it means ... is it something that might be good to have as a constraint ... to see if there's some breaking point ... a big groundswell, v. only a few ... it's input to the editor team for a small candidate initial set justin: for `now` as opposed to `ever` burn: what can be proposed to go into a first draft hta: we're delaying WebRTC until tomorrow morning cullen: take AspectRatio to near Zoom ... do sourceid, width, height, framerate, facingmode burn: i don't want each person to give me their list gmandyam: that list is my fault ... it wasn't meant to be for everywhere ... i was satisfied w/ the v6 list ... i hate reopening this debate hta: negotiation between you and Travis isn't consensus of TF burn: Travis's reason for excluding AspectRatio wasn't valid dom: is this about settings or constraints? burn: yes, constraints=states=capabilities ... we're only talking about control cullen: how are we doing this? 
Josh_Soref: hands up, one at a time burn: +1 cullen: object to conclusion Josh_Soref: input to the editors burn: a proposal may go to the list, but no change [ sourceId 13 ] [ width 7 ] <dom> [so people understand what sourceId means in the context of control?] [ height 17 ] <dom> [but then that's selection, not control] [ aspectRatio 10 ] [ framerate 17 ] [ facingMode 12 ] [ zoom 1 ] <Travis> dom, sourceID can be applied as a constraint to a track with a different sourceId. That's an overconstrained situation :-) burn: would anyone actually raise their hand for anything after zoom [ Candidate constraints/states/capabilities ] <dom> Travis, but then that's not one useful to include in the list of control constraints [ sourceId 18 ] [ volume 16 ] [ gain 1 ] burn: thank you dom: Selection is ? <Travis> dom, The goal being we maintain parity between state, constraints, and capabilities. hta: action item on chairs to take to list [ Lunch ] <Travis> Folks should spend the time to read the v6 proposal in detail if they haven't had a chance. Thanks! Summary of Action Items [NEW] ACTION: ekr to draft text on no-permission-required for recording for inclusion in the mediastream-recording doc [recorded in [12]http://www.w3.org/2013/02/06-mediacap-minutes.html#action01 ] [NEW] ACTION: eric to draft text on no-permission-required for recording for inclusion in the mediastream-recording doc [recorded in [13]http://www.w3.org/2013/02/06-mediacap-minutes.html#action02 ] [End of minutes] __________________________________________________________ [1]W3C [1] http://www.w3.org/ Web Real-Time Communications Working Group F2F Meeting 07 Feb 2013 See also: [2]IRC log [2] http://www.w3.org/2013/02/07-webrtc-irc Attendees Present Regrets Chair hta, stefanh Scribe timeless Contents * [3]Topics 1. [4]Conclusions Media Cap 2. [5]MediaStream Cropping 3. [6]Error handling 4. [7]Data Channel API 5. 
[8]Stats * [9]Summary of Action Items __________________________________________________________ <trackbot> Date: 07 February 2013 <dom> stefanh: propose to adapt agenda in light of lack of progress on SDP interface yesterday <dom> ... instead of SDP agenda item, we'll look at video scaling and cropping in Media Capture Task Force <scribe> scribe: timeless Conclusions Media Cap [ Conclusions Media Cap discussions Boston ] [ hta reads slide ] hta: any comments/complaints? [ None ] cullen: thanks guys for doing this MediaStream Cropping [ MediaStream Cropping ] justin: this is a quick set of slides i put together ... for how a camera stream could be provided in an aspect ratio not equal to native [ Problem statement ] <dom> [10]Justin's slides on aspect ratio [10] http://www.w3.org/wiki/images/7/76/Aspect_Ratio.pdf justin: most sensors and cameras are moving to 16:9 capture ... not all cameras are new ... quite a few are 4:3 ... it's hard for a full screen gui, especially full screen gui @ 16:9 ... <video/> will always pad to fit ... as opposed to crop to fit ... even if <video/> would crop to fit ... you wouldn't want to encode those bits that'd be cropped at the other end ... if you're doing local preview, you'd probably want to see the other side ... so people can see if they're out of frame [ Solution: crop 4:3 to 16:9 ] justin: discard top/bottom (or left/right if needed) ... Flash's camera api would do this if you asked for an unsupported res ... Instagram likes squares, so you could get this ... i'm only concerned w/ crop - letterboxing/pillarboxing are available from <video/> [ General Approach ] justin: opt in, new mandatory constraint ... you want this specific height, width ... width.max,min=640 ... height.max,min=360 ... allowCrop=true ... if camera supports resolution that falls within mandatory constraints, use that res ... 
if camera supports res that exceeds mandatory constraints, allowCrop=true to crop to satisfy mandatory constraints dom: if this is mandatory ... how does it fit? ... it doesn't seem like a `selection` criteria justin: it's more for having chosen a camera ... selecting settings later dom: so, it's a constraint that may be more for control than selection justin: i'm intending it for that, yes burn: this is control, not selection ... why mandatory { allowCrop } ... instead of mandatory/optional { cropIfNeeded } justin: for optional things <dom> [this seems to point to a growing divergence between constraints-for-selection and constraints-for-setting] justin: if these bounds are not hard bounds ... then it'd be a closest match burn: if it didn't get it, it didn't get it ... the point of optional constraints is notifying them if they can't be satisfied ... i'm proposing `cropIfNeeded` ... if you put it in mandatory, then it would crop if it needs to and can justin: do you have an example? ... i have a camera that can or can't crop in hardware burn: you might have other optional constraints relating to aspectRatio ... if you get those, you might want cropping, ... if not, you might not care JonLennox: comment, that might make this harder ... i'm often in a situation of VGA camera ... cropping `middle 2/3` makes it easier ... but on a mobile phone in landscape, cropping like that doesn't do what i want justin: i have another proposal ... PeterThatcher mentioned that problem on the ride over fluffy: trying to simplify things ... i assumed crop was always true ... if they asked for 16:9 and the input is 9:16, they get a postage stamp ... person rotates phone, problem goes away ... caution that adding complexity won't help people ... what happens if it's mandatory and it comes back `can't` hta: i kind of like the general approach ... but i dislike the examples ... i live in the world of resizable windows and interchangeable cameras ... i want to write a UI once ... 
that will work when the user resizes the window ... and works w/ Barbie Doll camera ... and an HD camera ... i want 1 of 2 things ... either everything in picture field ... or i want picture to fill my field ... in neither situation should we stretch the picture ... remove the talk about min/max, and use the res. ... if you need to crop, crop justin: are you +1'ing fluffy ? ... if you ask for a res, and hta: when media is flowing ... browser knows where it's going ... and it knows where it's coming from ... it will have to do a fit operation ... this is an instruction to browser to crop instead of letterbox ... it doesn't instruct browser to aim for res timeless: he's complaining about the example justin: he's saying constraint pillarbox/crop hta: my real window size is 250:172 ... it should fit to that ... the best way to achieve that ... i think the browser should do what it needs to do stefanh: with <video/> it can adjust to video aspectRatio ... and there are examples of using <canvas/> to crop ... canvas uses lots of power ... and you lose advantage of sending non-displayed bits justin: i was going to add <canvas/> in an earlier slide ... if we had a Media Stream Processing API [ laughter ] justin: doing things in the GPU ... could be very efficient derf: i want to apply this to any video stream ... not just cameras ... according to settings, a <video/> can't have this applied [ Examples ] justin: this isn't the alternate proposal ... just how to use this constraint ... 720p sensor, want 4:3 width.max,min=960 height.max,min=720 allowCrop=true burn: on this topic of do we let browser do its wisest thing ... or let JS dev give a preference ... i don't think it makes it more complex to allow both cases ... it's quite productive to let the user choose ... crop, pad, best-guess justin: that's what i heard fluffy say burn: if there's a mismatch between <track> request and destination ... then you can express a preference as an app writer ... 
for cropping, or padding, or not caring justin: how do you recognize this mismatch when the destination is PeerConnection timeless: shouldn't PeerConnection be able to express a resolution? burn: there's no way to express PeerConnection res as you do for <video/> ... unless we want to add that [ Yes ] burn: that would address a lot of this ... because then it becomes parallel to <video/> fluffy: what i heard by hta was what i said ... a resize by Pretty, Ugly, ... ... for something that crosses the PeerConnection ... if we want that to propagate back to the origin ... the resize will have to go back across the PeerConnection ... involving a renegotiation ... complicated, but very nice ... if no one sets constraints ... but we need a way to say "i'm going to send 640x480" ... by creating a track with that ... and PeerConnection says "oh, that's what i'm sending" ... and then it doesn't negotiate justin: someone changes window size ... and that pushes it all the way back ... if someone changes window size and camera has to stop and restart, ... that yields wacky behavior fluffy: if camera is SIF camera ... and someone selects QCIF ... allow crop also ... have constraints? justin: proposing <dom> s/QSIF/QCIF justin: PeerConnection encoder can do cropping internally <dom> [11]CIF on wikipedia [11] http://en.wikipedia.org/wiki/Common_Intermediate_Format justin: the point is to give something to the encoder to do efficient scales ... for power of two scales ... for efficient matches of what output can display martin__: the more i think of this ... the more i think i have too many ways to do the same thing ... this seems to belong in the renderer ... we have <video/> it may pillarbox/letterbox/crop ... if we treat PeerConnection as renderer ... then i don't think we need constraints justin: assume a single camera ... no choosing ... for WYSIWYG ... do we have 2 mechanisms or 1 ? ... same way to ask camera for get X res, same way for X' res martin__: two ways ... 
@ <video/> ... and a way @ Track() ... which may or may not ... if each sink is the only place to set these properties ... the only constraint would be for selection ... i'm coming to the conclusion that that's pointless ... since in most cases it's orientation of camera that matters most hta: device selection is different topic ekr: whatever camera aspects ... if i plug in arbitrary res ... to <video> ... we expect something as reasonable as possible [ it will rescale in letterbox or pillarbox ] martin__: there's no way to control it justin: is it specified? stefanh: i think it's specified ... you can ask it to auto grow justin: you can never crop ekr: anything really sophisticated ... like Media Stream Processing API ... want something analog to <video/> ... i want to force fit this camera to this stream ... this is a stop-gap? justin: fixing a common problem w/ an efficient+cheap solution ekr: i could live w/ either @source or @sink ... i think i exploded at pushing across Connection stefanh: to martin__ <dom> [12]HTML5 spec says: "In the absence of style rules to the contrary, video content should be rendered inside the element's playback area such that the video content is shown centered in the playback area at the largest possible size that fits completely within it, with the video content's aspect ratio being preserved. Thus, if the aspect ratio of the playback area does not match [12] http://www.w3.org/html/wg/drafts/html/master/embedded-content-0.html#dom-video-videowidth <dom> the aspect ratio of the video, the video will be shown letterboxed or pillarboxed. Areas of the element's playback area that do not contain the video represent nothing." stefanh: to add Crop <media> element? martin__: yeah burn: i think simplest logical model for all of this ... is your Track is used to identify what's natively produced ... and Sink is what you expect to get out ... and influences where you munge ... so PeerConnection should have things like <video> ... 
i think constraints on Track still make sense ... so you can ask for High Res from Camera (which offers High Res and Low Res) ... so an app writer can say "these matter to me" ... so sink does processing ... and make it clear that's where it is justin: so that's a proposal to flag it ... at ... burn: allow back propagation or not is up to us ... we could allow settings on PeerConnection to allow settings to go across ... PeerConnection gets to decide how its properties as a sink are set justin: choosing how a device is opened up ... i argue this is analogous ... saying "give me X", or saying "give me X'" ... we could talk about feedback mechanism ... saying `camera can produce X res's` ... the reason allowCrop=true is there ... is for if you're in Portrait mode ... and you don't want to do cropping ... maybe apps don't ever do that ... give me these pixels no matter what ... give apps some flexibility burn: i'm not recommending taking away control from app writer ... question is where to do it ... on Sink or on Track PeterThatcher: is it safe to say we want to avoid reopening camera? ... does app tell browser this? justin: most flexible is to open @high res ... and crop/scale ... cameras have non-0 reset time ... and there's noticeable artifact as they stop/restart camera PeterThatcher: would support controlling the crop at camera open time to avoid other constraints changing and causing re-open <dom> ScribeNick: dom justin: changing camera after feedback from sink martin__: the simple response to that is: don't do that ... if you're operating in the wrong mode for the display you're operating in, you'll have to deal with it at some point ... This proposal is adding processing in the pipeline, which is what I wanted to avoid ... leaving constraints for elements that matter for the hardware ... anything related to processing should be left up to a processing api ... 
that api would provide a way to crop the video among other things justin: that would be great, but that expands the scope of needed work martin__: we would need to gather use cases, but that would be much cleaner than trying to push processing through constraints adambe: every time we talk about constraints, it sounds like this could be for sources, tracks, or sinks ... that is very confusing ... it would be easier if only one of these should be the focus of constraints burn: I have a similar comment; I think it should apply to two, not three adambe: we have the recorder as a sink, for peerconnection, stefanh proposed a transport handler to support setting priority etc ... for the MediaElement, you can control the width and height, but I don't know how you would control framerate, etc ... It seems more natural to let the sink be in control Dan_Druta: I heard two aspects of applying constraints for: efficiency, and user experience ... we should be very clear about it when we talk about pros and cons juberti: re the sink driving, that's an important model, but I don't think we should force everyone under that model gmandyam: in the media capture call in December, we discussed @@@ ... you can set options for recording independently of the source ... it's not a real-time operation either, so with different considerations than peerconnection juberti: setting stuff at the source level doesn't invalidate options at the recording level burn: the current state of the settings proposal that we talked about, and the world we live in the current spec ... is that of source, tracks, and sinks, you can only set constraints on tracks and sinks ... peerconnection doesn't have the same kind of settings capabilities ... there is a need to control what the source provides natively ... (if the device cheats e.g. with Apple, there is nothing we can do about it) ... we only need two, but we have three places ... we could set constraints on sources, tracks, sinks ... 
the problem with constraints on peerconnection, you can have multiple tracks going in a peerconnection ... each of these tracks could have their own configuration ... this is why we've been talking about control on tracks ... the problem is to determine whether munging is done at the track abstraction level (and the source is where you set what you want), or at the sink abstraction level (and the track is where you set what you want) fluffy: +1 to burn ... we're lacking a mental model of all this ... there are sources, sinks, and tracks connect them together ... the problem with sinks is that we don't have much control about it <gmandyam> @@@ = At December MediaCap TF call, the participants agreed that width-height combo settings are required for the recording API. I proposed a possible setting to the mailing list. fluffy: as a result, tracks are preferred location for exercising control ... if you put constraint on tracks, that affect what's produced by that track ... for the sources, the general model is that they look at all the tracks they deliver, and they select the best setting that allow them to satisfy the tracks they're producing ... As a result, the mental model should be that Tracks are where we want to set constraints preferably juberti: that makes sense to me Josh_Soref: regarding the risk of latency attached to reopening a camera ... could this be something we can prevent using a constraint? ... (preventing from reopening a camera to adjust to a new setting) ... On the PeerConnection bit, I've always assumed @@@ stefanh: I think we need more control over transmission media ... we could apply that to tracks, or provide an api in peerconnection per-track juberti: I would prefer to apply it via peerconnection stefanh: I agree hta: the video element has the CSS3 property to define the cropping/fitting model martin__: I don't mind fluffy's suggestions, except that it means you now have two ways to control this for the video element ... 
I'm kind of tempted that we don't have this setting API for that ... I'll have to talk with Travis about that juberti: I have an alternate proposal <hta> (the css3 property is called object-fit - it took me a while to find it) juberti: where you define the maximal crop aspect ratio -> [13]http://dev.w3.org/csswg/css3-images/#object-fit object-fit in CSS Image Values and Replaced Content Module Level 3 [13] http://dev.w3.org/csswg/css3-images/#object-fit Josh_Soref: in the recorder case, there isn't necessarily a post-processing server somewhere else ... I can be doing local recording juberti: the cropAspectRatio is the process aspect ratio @@@: we would make it clear that process stuff would only be used for peerconnection cullen: do you have a slide on scaling? juberti: no cullen: I would love that we say we never scale up martin__: disagree, it has to happen in some cases ... I think we're talking about processing, and where that processing is applied stefanh: to avoid upscaling, we would have to change the mediaelement, since it scales up cullen: I was referring to not allowing to scale up at the track level, but I think the length of the line shows I'm losing on that one ekr: what does it mean to scale up a track? cullen: say you've acquired a sd camera ... and then you apply a constraint to the track to get a bigger resolution ekr: so I have a SD source that gets a SD track, plugging it into an HD video tag; that will be ugly ... if you were to apply the same setting to a peerconnection, the same thing would happen ... what you're saying is that we should not allow to augment the resolution of a native track burn: I think you should have control ... but fluffy, you're badly schizophrenic ... if the constraint you set on the track is the output of the track ... and it's ok for munging to happen ... then scaling up has to be allowed to happen ... what we would want is to have constraints to limit the way the processing can be done ... 
and I think that's what justin is describing juberti: the goal is to provide a simpler way to do processing without providing a full-blown graph processing api ... I worry about an authoritarian model where @@@ ... I don't think it makes sense to send the quadruple amount of bits with no increased quality tim: afaict, no browser has implemented object-fit property except opera in the past two years -> [14]http://caniuse.com/object-fit object-fit in caniuse [14] http://caniuse.com/object-fit jan: we're a bit off-track with upscaling ... we've talked about source constraints and track constraints ... if we had peerconnection.maxWidth, peerConnection.maxHeight, this would cut short this whole debate juberti: I think the peerconnection should have specific control on what each track aspect is ... I think we're pretty far afield from the specific problem I'm trying to solve here martin__: if you're concerned about sending more bits over a PeerConnection than is necessary, ... having these settings on PeerConnection allows the PC to say "you've set this higher than what the source provides" Jan: I would modify my earlier proposal to have PeerConnection say the aspect ratio of what the client will care about (e.g. 4x3) ... this is only for peerconnection juberti: not really, since object-fit is not implemented ... some cameras can do cropping at the hardware level, but not all of them ... this would let us solve the problem for these various cases ... it's not the general mediaprocessing api, but it has a better RoI hta: I think there is some level of agreement that cropping needs to be able to happen ... and the application should be able to have some control over it ... there is a fair bit of debate on what the model is on what constraints apply to, and how they show up at the destination of the track ... vs modification at the source of the track ... We kind of passed that boundary when we agreed to be able to produce multiple resolutions from a single source ... 
I would like to ask martin to write up his concerns on setting constraints on track ... and propose an alternate approach <hta> ACTION: Martin to write up his concerns with using constraints on track to manipulate stuff that might apply to sources and might apply to sinks [recorded in [15]http://www.w3.org/2013/02/07-webrtc-minutes.html#action01] <trackbot> Created ACTION-83 - Write up his concerns with using constraints on track to manipulate stuff that might apply to sources and might apply to sinks [on Martin Thomson - due 2013-02-14]. hta: I would like justin to take an action item to describe this constraint in a way that can be included in the document <scribe> ACTION: Justin to describe the constraint for aspect ratio for potential inclusion in the document [recorded in [16]http://www.w3.org/2013/02/07-webrtc-minutes.html#action02] <trackbot> Created ACTION-84 - Describe the constraint for aspect ratio for potential inclusion in the document [on Justin Uberti - due 2013-02-14]. Summary of Action Items [NEW] ACTION: Justin to describe the constraint for aspect ratio for potential inclusion in the document [recorded in [24]http://www.w3.org/2013/02/07-webrtc-minutes.html#action02] [NEW] ACTION: Martin to write up his concerns with using constraints on track to manipulate stuff that might apply to sources and might apply to sinks [recorded in [25]http://www.w3.org/2013/02/07-webrtc-minutes.html#action01] [End of Media Capture minutes] __________________________________________________________
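The crop described under MediaStream Cropping (discard top/bottom, or left/right if needed) can be sketched as plain geometry. This is a minimal illustration in Python, not anything the Task Force specified: the function name centered_crop and its parameters are assumptions made for the example, and it only computes the largest centered rectangle with the requested aspect ratio, never scaling the picture.

```python
from fractions import Fraction


def centered_crop(src_w, src_h, aspect_w, aspect_h):
    """Return (x, y, w, h): the largest centered rectangle inside a
    src_w x src_h frame whose aspect ratio is aspect_w:aspect_h.

    Hypothetical helper for illustration only; no scaling is applied.
    """
    if min(src_w, src_h, aspect_w, aspect_h) <= 0:
        raise ValueError("dimensions and aspect terms must be positive")

    target = Fraction(aspect_w, aspect_h)
    source = Fraction(src_w, src_h)

    if source > target:
        # Source is wider than the target: keep full height,
        # discard columns on the left and right.
        h = src_h
        w = int(src_h * target)
    else:
        # Source is taller than (or equal to) the target: keep full width,
        # discard rows at the top and bottom.
        w = src_w
        h = int(src_w / target)

    # Center the crop window inside the source frame.
    x = (src_w - w) // 2
    y = (src_h - h) // 2
    return x, y, w, h


if __name__ == "__main__":
    # 4:3 VGA camera cropped to 16:9, as in the "General Approach" slide.
    print(centered_crop(640, 480, 16, 9))
    # 720p sensor cropped to 4:3, as in the "Examples" slide.
    print(centered_crop(1280, 720, 4, 3))
```

With the figures from the slides, a 640x480 source cropped to 16:9 keeps a 640x360 window, and a 1280x720 source cropped to 4:3 keeps a 960x720 window. A portrait 9:16 source asked for 16:9 keeps only a thin horizontal band, which is the "postage stamp" case fluffy raised.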
Received on Friday, 8 February 2013 16:26:24 UTC