See also: IRC log
<scribe> Scribe: Josh_Soref
RESOLUTION: Minutes from 28 February 2012 call are approved
<anant> proposed way to integrate MediaStream into the getUserMedia doc: http://mozilla.github.com/webrtc-w3c/getusermedia.html
<richt> +1 to Anant's proposal.
anant: we'd like to propose integrating the change and publishing an editor's draft
... if there are no major comments, I'd like to move this into VC on Friday
adambe: we can always back out changes
anant: we don't gain anything by publishing sooner
Travis: publishing, or getting a new editor's draft?
anant: we'd like to have a new editor's draft
adambe: the reason I proposed that we should move it as soon as possible
... is that I don't think we need to introduce another step
... there's already the Editor's Draft before the Working Draft
Travis: this change involves moving part from the CVS repo to the Hg repo
adambe: we're currently working in Github and then publishing to CVS
stefanh: Anant, please post a link on the list with a link to the updated version
anant: how does Hg relate to the ED on www?
Travis: pushing to Hg on w3 updates the ED
adambe: where is the mercurial repository in the picture?
anant: getUserMedia is on Hg (dvcs.w3.org)
... WebRTC is in CVS (cvs.w3.org)
adambe: we have git (internal edits)
stefanh: for the time being it will probably stay in the same repo as WebRTC (CVS)
http://lists.w3.org/Archives/Public/public-media-capture/2012Apr/0027.html
burn: I put a summary at the top
... a number of comments on the list related to not understanding the structure
... I looked and realized that a number of things related to violating JavaScript syntax
... I'd much rather MediaStreamDeviceCapabilities be a dictionary
... but elements of an array can't be a dictionary
... so I don't know how to do that
... suggestions (offline) welcome
... I distinguished between MediaStream Constraints and XXW
... when you have an object that has only one key-value pair, and when you have an object with one-or-more key-value pairs
... looking at Constraints
... I didn't change the core algorithm
... I changed 2A and 3C1
... relating to what to do when a constraint is not supported by the browser
... 2A is in the Mandatory set
... if the author specified a Mandatory constraint and the browser doesn't support it
... then it must count as an error
... 3C1 is for Optional constraints
... the instructions say to skip it if the browser doesn't support it
... I updated the first example
... to use the new syntax
... I also added two more examples
... in the first example, I included mandatory and optional lists
... both parts are Sequences
... we could change mandatory to be a Set
... the browser is specified to honor all constraints for Mandatory, so order doesn't matter
... but for Optional constraints, order matters
... if you have conflicting optional constraints
... the algorithm requires satisfying the first one
... for simplicity in the mind of the author, both are sequences
... with ONE value permitted in each array element
... I'm open to improvement in how we structure each element
... the requirement stems from having key-value pairs
... the first example has Mandatory and Optional sections
[ burn reads through example ]
burn: the browser can fail to satisfy optional constraints
... the goal is to satisfy as many as it can in the order given
... for the second example
... what to do if you don't want to set any constraints
... I came up with a name
... video-enum-provide
... the registry provides for max, min, enum
... the principle is that there's a constraint for each one
... solely to say "I want a stream of that type returned"
... the example says "I must have a video"
... "audio would be nice"
anant: I like the approach in general
burn: I've only defined audio + video
... I haven't seen any others
anant: I'd propose maintaining the dictionary
<anant> {audio: true, video: true} <-- simple case
stefanh: didn't richt propose something like that?
<anant> {audio: {mandatory: [], optional:[]}, video: {mandatory: [], optional: []}} <-- constraint case
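The two shapes anant pasted can be sketched side by side. This is purely illustrative of the proposal under discussion (none of this syntax was standardized at the time); the `normalize` helper is an invented name showing how a browser might treat `true` as shorthand for "no constraints".

```javascript
// Simple case: just ask for both kinds of media.
const simpleRequest = { audio: true, video: true };

// Constraint case: per-media mandatory/optional constraint lists.
const constraintRequest = {
  audio: { mandatory: [], optional: [] },
  video: { mandatory: [], optional: [] }
};

// Hypothetical helper: treat both forms alike internally, with
// `true` meaning an empty constraint set.
function normalize(spec) {
  if (spec === true) return { mandatory: [], optional: [] };
  return spec;
}
```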
richt: it comes back to a more simple thing
... it's really the use cases
... which we haven't talked about at all
<richt> My feedback: http://lists.w3.org/Archives/Public/public-media-capture/2012Mar/0042.html
<richt> Travis' feedback: http://lists.w3.org/Archives/Public/public-media-capture/2012Mar/0072.html
richt: the use case here is user/environment facing cameras
burn: you and I have talked about it
... just because you don't believe it
... doesn't mean it isn't real
... I put in a proposal for a user/environment as a constraint
... I remember someone asking for other things
... maybe jesup asked about resolution?
richt: I'd really like to see the UCs
... on the list
... so we could see it and discuss it
anant: burn, stating that a web app needs to control resolution
... isn't a valid use case
... you need to frame it in terms of a real application someone wants to build
hta: there is a UC for RTCWeb needing [rec:F25] to control resolution
<anant> https://tools.ietf.org/html/draft-ietf-rtcweb-use-cases-and-requirements-06#section-5.3 that's the document
<anant> section 5.3 lists API requirements
<anant> 5.2 has browser requirements
<hta> Requirement F25 of draft-ietf-rtcweb-use-cases-and-requirements
Travis: is controlling resolution similar to min-height/max-height/min-width/max-width
anant: I think width/height are different from resolution because pixel density differs
Travis: ah, pixel density
anant: we typically try to avoid specifying width/height in pixels
... typically you use `1em`
... given that devices have different dimensions
derf: we hard code pixel count for em in CSS
anant: but by using a different unit than pixels
... it gives us the freedom to vary
Josh_Soref: No it doesn't
derf: not at all
anant: we should use a different dimension for width-height
... and then use pixels for dpi
Travis: if we want media stream to interoperate with Canvas which deals in pixels
... then we need to have a way to translate
... I'd like to get back to burn
... on mandatory/optional for video/audio
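The translation Travis alludes to can be sketched minimally. A canvas is addressed in CSS pixels while a capture device reports physical (device) pixels; in browsers `window.devicePixelRatio` supplies the scale factor, passed in here as a parameter so the sketch is self-contained. The function name is hypothetical, not from any draft.

```javascript
// Convert a CSS-pixel length to device pixels, e.g. when sizing a
// canvas backing store to match a captured video frame.
function cssToDevicePixels(cssPx, devicePixelRatio) {
  return Math.round(cssPx * devicePixelRatio);
}

// A 640 CSS-pixel-wide canvas on a 2x display is backed by 1280
// device pixels.
```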
burn: the pixel dimension discussion is orthogonal
... wrt audio/video with mandatory/optional
... and a simplified form of true
... while audio/video are the two stream types we talk about today
... those are the types that my company cares about
... I've heard people say they want to support other media types in the future
... so I tried to come up with a data structure that doesn't constrain to audio/video
... only the registry constrains it
... if a new media type comes up, then you can just add items to the registry
... without revising the document
Travis: one of the UCs suggested a while ago
... was to record your screen
... while that technically falls under the category of Video
... it seems like you could define that as a new Provider
... [The Screen]
burn: that's possible
... you could go to the registry and do that
... maybe there are kinds of Text sent as something other than audio/video
... I don't know what the future holds
Travis: I think it's definitely good that we design this API so that it's easily extensible/not limiting
... I applaud that
anant: the reason you don't want it to be top level is to avoid revising the API spec
... it seems like we're working around revising the api
<jesup> media == time-labelled sampled data, typically from a sensor of some sort
anant: maybe we should make the first argument to getUserMedia be extensible
... so that we could extend things in the future
burn: I think we should decide how the registry should operate
... I was trying to make the registry easy
... if you and others would rather audio+video be top level, we could do it
anant: we should prioritize the API being better over curating the registry
burn: I agree with you
... in general I'm in favor of making things easier for the end user of the api
... vs the implementers
Travis: may I propose that we just add audio+video to the constraints dictionary
hta: can I propose to not do that
... having two pathways through the code makes it more complex
adambe: Travis is proposing something similar to the one I proposed
... you'd remove the audio-enum-provide
Travis: that's what I'm trying to suggest
... we should take this to the list
adambe: there's already a thread about the syntax of this
... Paul Neave
... burn pointed out that order is important
... we have a thread for this
... regarding anant's proposal
... and revising the w3c spec
... I don't think there's a problem with adding screen at a later point
... adding stuff is pretty straightforward
<adambe> richt: this was the mail: http://lists.w3.org/Archives/Public/public-media-capture/2011Dec/0061.html
Travis: I agree
... and adding things would be backwards compatible
richt: to be clear on that
... we have implemented it
... but we can make it work
... we shouldn't be wed to the current bit
Travis: that's a very politically correct thing to say
burn: if the way to actually make this work is to have mandatory, optional, audio, video
... and that we can add others later
... maybe that's the way to go
stefanh: maybe that's the way to go
richt: there's concern about booleans
... dictionaries would be better
burn: closer to what anant proposed?
richt: yes
burn: I'm fine with that too
... maybe the thing to do is to write up both
<richt> Why booleans are bad in Web APIs: http://lists.w3.org/Archives/Public/public-webapps/2012JanMar/0349.html/
burn: and see what people think
<stefanh> ACTION: burn to write up the audio-video-mandatory proposals to the list [recorded in http://www.w3.org/2012/04/24-webrtc-minutes.html#action01]
<trackbot> Created ACTION-41 - Write up the audio-video-mandatory proposals to the list [on Daniel Burnett - due 2012-05-01].
burn: in the third example, audio+video are in mandatory
... the browser has to return audio+video, or it's an error
burn: I cleaned up the definition of the structure
... so it's now an array
... with one entry for each device/channel
... I'll let someone else suggest the appropriate term for that
... each entry contains an id + capabilities
... an id needs to be unique relative to the other ids in the same capabilities array
... "camera001" v. "camera002"
... the name must be composed of the high level media type
... Camera should have said Video
... followed by an opaque alphanumeric id
... so that you can distinguish by media type and per device
... but nothing else
... I used "supported" and "satisfiable"
... the description of the Trusted Scenario hasn't changed
... I added a bit for the Untrusted Scenario
... for the uses at my company, we're pretty much interested in the trusted scenario
... if people have suggestions for untrusted, please do
... my suggestion was just listing IDs, but no capabilities
... determining trust levels, I wrote TBD
... other TF members are better able to comment
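The capabilities structure burn walks through above can be sketched as follows. The entries and capability keys are invented for illustration; only the shape (an array of id-plus-capabilities entries, with ids unique and prefixed by the high-level media type) comes from the discussion.

```javascript
// One entry per device/channel; ids are media-type prefix plus an
// opaque alphanumeric suffix (burn's "camera001" should have read
// "video001" under this rule).
const deviceCapabilities = [
  { id: "video001", capabilities: { "video-max-width": 1280 } },
  { id: "video002", capabilities: { "video-max-width": 640 } },
  { id: "audio001", capabilities: {} }
];

// Check the two id rules: uniqueness within the array, and a known
// media-type prefix (only audio/video are registered today).
function idsAreValid(caps) {
  const seen = new Set();
  return caps.every(({ id }) => {
    if (seen.has(id)) return false;
    seen.add(id);
    return /^(audio|video)[A-Za-z0-9]+$/.test(id);
  });
}
```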
anant: the getCapabilities call
... you defined it under navigator
... we can't have it there
... we could have navigator.media.getCapabilities
burn: good point
... I wasn't paying attention to that. you're absolutely right
... let's do that on the list
<Travis> navigator.getUserMediaCapabilities :) (bikeshedding)
burn: that's related to the discussion about how many different places should you be able to get capabilities/set constraints
... we should decide what these mean
anant: I agree
... your company is the trusted case
... at mozilla we're more interested in the untrusted case and expect it to be more common
burn: the reason it isn't there is that although I think it's important, I don't know how to do it right and want someone to do it right
hta: if you have a camera (with microphone), is it really two devices?
burn: I was treating it as two, but I am open to changing that
... we have a challenge anyway
... earlier proposals didn't really distinguish
... they asked for "give me audio+video from the same device"
... this is where anant's more general constraints come in
... maybe I want audio+video from the same device
adambe: isn't that incompatible?
... where would those go?
burn: under General
hta: we need to be clear about whether a camera with audio is two or one device
adambe: it doesn't have to be two devices
... say I want to use my headset with a mic and my webcam
... you can't force people to use the crappy mic
hta: you might want to know if the mic is next to the camera
... I expect CLUE people will want the 6-dimensional coordinates of the microphone
adambe: about the algorithm
... 1C the first pass is through all possible streams the browser could return
... I'm not sure how practical that is
[ burn points to sentence ]
burn: there may be more efficient ways to implement this
... it's easier to describe this algorithm as a process of elimination
... than a process of addition
... if the browser can take its 5 streams and know which satisfy the algorithm, that's fine
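The elimination process burn describes (error on an unsatisfiable mandatory constraint per step 2A, skip an unsatisfiable optional constraint per step 3C1, honor optional constraints in order) can be condensed into a sketch. The function and the `satisfies` predicate are stand-ins for the registry-defined per-constraint checks, not spec text.

```javascript
// Filter candidate sources by every mandatory constraint (error if
// none survive), then walk the optional list in order, keeping each
// narrowing only if at least one candidate remains.
function selectCandidates(candidates, mandatory, optional, satisfies) {
  let set = candidates.filter(c => mandatory.every(m => satisfies(c, m)));
  if (set.length === 0) {
    throw new Error("mandatory constraint not satisfiable"); // step 2A
  }
  for (const opt of optional) {              // order matters (step 3C)
    const narrowed = set.filter(c => satisfies(c, opt));
    if (narrowed.length > 0) set = narrowed; // else skip it (step 3C1)
  }
  return set;
}
```

With conflicting optional constraints, the first one in the list wins because it narrows the set before the later one is tried, which matches burn's point that optional order is significant.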
Travis: algorithm step 4
... will call success callback with the final set
... does each callback get a single track?
... or does the UA group them into as many compatible stream objects as possible?
burn: I wasn't clear about this
... for quite a while, it wasn't clear what a Stream was vs. a Track vs. a Channel
... it depends on what we say getUserMedia should return
... if one media stream is returned containing multiple tracks
... then I'd say it has at most 1 audio and 1 video
... if the group says that they're separate, then I'm fine with that too
... when you get to this algorithm
... See 3
... 3D, select one stream from the candidate set and add it to the final set
... we add one set of video-data (if requested) and one set of audio-data (if requested)
... we're not merging
... [ burn tries to avoid saying Stream/Track ]
hta: I think it should be Track
burn: the closest to my understanding is Track
... so I should probably rewrite it
Travis: the algorithm will return at most one Track of audio and one Track of video
... and return it in some container
adambe: the output is what the user gives it
... it isn't a problem if
burn: the point of constraints is to say what you as an application author cares about
... and the browser selects one
adambe: the user needs to select a camera
... even if you have constraints, the user needs to select from the satisfiable list
hta: step 3D
... select one stream [s.b track]
... this step may be automatic or involve user interaction
jesup: if these algorithms are limited to returning a single video/audio track
... how do we handle the UCs that handle multiple synchronized cameras?
... this algorithm concerns me
burn: that's a great question
... there's a separate thread on the list about that
... whether it should return one track per media type
... or should return multiple ones
... the algorithm is written for the single case
... it's possible to extend it for multiple
... but I'd like to scope that to a distinct discussion
... I agree jesup, this algorithm doesn't cover that
stefanh: burn, there's a step where the user would be involved
burn: I don't know if we've been completely clear about how permissions work
... or where user involvement occurs
... I'm open to suggestions
... maybe hta's 3D is correct
adambe: it needs to be before the success callback
burn: I don't have a particular opinion on that
... I'm happy to have other people duke it out
... and we can add it explicitly if it's necessary
richt: I believe it's covered in the existing algorithm
... it says the user must select something
... I need to go check
burn: the last bit is registration
... audio/video were the only two listed as required
... but if we change the structure,...
... I put example definitions for width/height/direction
... those are _examples_
... it is up to this group to decide
<richt> fyi: Step 10 in existing getUserMedia algorithm is the point that permissions occur: http://dev.w3.org/2011/webrtc/editor/getusermedia.html#navigatorusermedia
burn: on constraints
... I'd almost rather other people suggest them
... it seems whatever I propose, some want and some oppose
... but I think we can take that to the list
stefanh: thanks burn
... it seems we're discussing details
... are there objections?
richt: I'm objecting
... I want to see some UCs
... that's the premise
... I think it's very important
stefanh: we have the Scenarios document that Travis wrote
richt: I looked through the requirements from Travis's document, and I don't think anything needs this
stefanh: hta pointed out F25
richt: I think F24
... was very vague
<richt> The WebRTC requirement says: "The browser MUST be able to take advantage of capabilities to prioritize voice and video appropriately." That doesn't necessarily pre-ordain constraints are required IMO.
<richt> F24 in https://tools.ietf.org/html/draft-ietf-rtcweb-use-cases-and-requirements-06
Travis: I've been planning to update the document
... I anticipate in the next several weeks to be able to put forth an update
... if anyone has other things they want to see added
... please send them my way
<stefanh> ACTION: stefanh to check with Travis on updating the scenarios document requirements portion in about 4 weeks [recorded in http://www.w3.org/2012/04/24-webrtc-minutes.html#action02]
<trackbot> Created ACTION-42 - Check with Travis on updating the scenarios document requirements portion in about 4 weeks [on Stefan Håkansson - due 2012-05-01].
stefanh: maybe hta and I should give help
Travis: I'm certainly open to that
... we should keep in mind what happens when getUserMedia is called a second/third time with different constraints
... it came up while Microsoft was investigating this feature
burn: that's a good point
... it's good to know when you're requesting new streams, and when you're requesting replacement streams
anant: is there a reason to distinguish?
... as opposed to you just closing a stream and requesting a new one
jesup: replacing may cause it to choose a different camera
... or querying the user may be a problem
... or interrupting the stream may cause a glitch
... I'm very much in favor of an API which allows for requesting modifications to existing streams
... I made comments on the list
... we talked about using the constraint language
... for modifying an existing request
anant: I agree there are valid UCs for replacement v. new
... I'd like to make a straw man that we don't need to use getUserMedia
... for replacement
... I think there's a set of constraints the browser could change automatically
<jesup> anant: agree
anant: there are some which the browser would need to do with User interaction
... for Modification, we could have the API be on the Stream object
Travis: I'll second that
... and suggest we describe what the action is for the affected stream
hta: I also like the idea of modifying the capabilities of an existing stream
... because it also maps well to modifying remote streams
richt: I also agree
... I was going to suggest the same thing as Anant. That changing the capabilities of an existing stream should happen on the LocalMediaStream object.
[ Time check ]
stefanh: we should arrange for a new call
... when should the next call be?
... after the WebApps F2F meeting?
... [ the first week of May ]
hta: at least 2 weeks from now
stefanh: yes
burn: that should be easy if we're just continuing the agenda
stefanh: hta and I will put up a doodle
hta: from the week of May 6th to the 12th
stefanh: yeah
... thanks everyone for joining
[ Thanks to the scribe for scribing ]
[ Adjourned ]
trackbot, end meeting