W3C

Media Capture Task Force Teleconference

23 Aug 2012

Agenda

See also: IRC log

Attendees

Present
Dan_Burnett, Dan_Druta, Dom, Giri_Mandyam, Harald_Alvestrand, Jim_Barnett, Josh_Soref, Milan_Young, Richard_Tibbett, Stefan_Hakansson, Travis_Leithead, adambe, derf, fluffy, gaowenmei, yahui
Regrets
Chair
Stefan_Hakansson, Harald_Alvestrand
Scribe
Josh_Soref

Contents


<trackbot> Date: 23 August 2012

<scribe> Scribe: Josh_Soref

Minutes Approval

stefanh: minutes approval...
... does anyone object to approving the minutes?

[ silence ]

RESOLUTION: Approve minutes from previous conference

Milestones

stefanh:

fluffy: there's a question about what is the basic functionality

hta: the closest thing we have to the thing on the list
... is you need to capture things from devices and pass them to media streams
... and you need to pick media streams apart and put them back together

fluffy: that's too vague
... you need to talk about whether you can take data in/out

hta: access to data was not in the proposal
... that's recording
... pushing without security is a no go
... we can't add security later

dom: an important thing about core/not-core
... is where we can get interop in the short term
... right now we have two browsers shipping with getUserMedia
... afaik the interop is limited to setting audio/video to true
... there isn't interop on assigning media to a <media> element
... i'm interested in hearing in anyone from Opera
... or Mozilla/Microsoft/Google
... about what they think we can get implemented in the coming months

TravisLeithead: for microsoft, i can't comment on upcoming feature set
... i'm in the game to make sure we have something implementable
... we've done prototyping in the past
... i'm interested in hearing what other implementers are up to
... and keep in mind what we think is best
... what we have out there is malleable

richt_: what we're shipping at the moment
... is extremely limited
... pretty much what you, dom, described
... we don't comment on upcoming releases
... our next stage is MediaStream
... there's no timeline or anything like that, that i can share
... the next level of the API is MediaStream
... and the next level after that...

hta: Google has getUserMedia, assignment to <media> via a url producing function
... we have not implemented constraints
... but we plan to

derf: we're similar to Google
... all of this is hidden behind a pref, because we don't have a security ui yet
... we're looking at constraints, but i don't think we'll have them in the next few months

dom: should we try to ship core getUserMedia that has just video:true/audio:true, and MediaStream assignment
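The minimal core dom describes — getUserMedia with only `video:true`/`audio:true` and MediaStream assignment via a URL-producing function (the pattern hta mentions for Chrome) — would look roughly like this in application code. This is a sketch only: the function parameters are injected so the sketch is self-contained, and in a real 2012-era page `getUserMedia` would be a vendor-prefixed navigator method (an assumption here, not spec text).

```javascript
// Sketch of the proposed "core": getUserMedia with only audio/video
// booleans, plus assignment to a <video> element via a URL-producing
// function. Parameters are injected so this runs without a browser.
function startPreview(getUserMedia, videoElement, createObjectURL) {
  getUserMedia(
    { video: true, audio: true },       // the only interoperable constraints
    function onSuccess(stream) {
      videoElement.src = createObjectURL(stream);  // url-producing assignment
      videoElement.play();
    },
    function onError(err) {
      // Interop note from the call: one browser throws, another invokes
      // this error handler -- apps today must be prepared for both.
      console.error("getUserMedia failed:", err);
    }
  );
}
```

In a real page this would be invoked with the browser's prefixed `getUserMedia` and `URL.createObjectURL`.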

fluffy: (Cisco), i don't think that meets the needs of the user community

dom: i think it would make it for a large number of users
... a video stream that you could play with
... and then we could extend that to add further functionalities
... this is a straw man suggestion
... the core that people are agreeing to
... it might be beneficial to the community

fluffy: i think it'll be hard to figure out how to add stuff in later
... i don't think you need to figure out the details
... but you need a structure

TravisLeithead: having a Recorder is pretty important to our scenarios

derf: fluffy, can you elaborate on what you need to build your applications?

fluffy: camera resolution control
... when you have an HD Camera
... and try to stream 20 channels at 720p
... you can't do that
... if i'm on a low bandwidth connection, some ability to pick something lower than that

burn: we have similar needs
... we need to be able to specify resolution
... the same thing we've been talking about
... minimum/maximum framerate
... if we want to build an application that isn't a toy
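A constraints object of the kind fluffy and burn are asking for might look like the sketch below: hard ("mandatory") requirements the app cannot work without, plus ordered best-effort ("optional") preferences. The specific constraint names (`minWidth`, `minFrameRate`, etc.) are illustrative assumptions, not finalized spec names.

```javascript
// Illustrative constraints object for camera selection: mandatory
// requirements plus ordered best-effort preferences. Constraint names
// here are assumptions, not finalized spec names.
const constraints = {
  video: {
    mandatory: { minWidth: 640, minHeight: 360 }, // below this the app is a toy
    optional: [
      { minFrameRate: 15 },  // prefer at least 15 fps...
      { maxWidth: 1280 }     // ...but cap resolution on low bandwidth
    ]
  },
  audio: true
};
```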

dom: if we agree on the core
... it just means the features would have to come later
... a good way to make progress
... on designing/priorities
... is to figure out which scenarios/parts of scenarios we want to enable

fluffy: i find this conversation surreal
... we did our design requirements
... our scenarios
... it seems like we're delaying things by proposing a second/update document

dom: thinking in terms of v1/v2 isn't the right way to think about it
... think about it as modules
... the reason we're having this discussion
... is if we keep adding features and features
... we're delaying interop
... on the core piece
... for as long as we need to discuss peripheral features
... the reason to discuss this is to try to get an interoperable portion out As Soon As Possible

fluffy: all of these features were mentioned as a requirement from day 1

burn: there are things that have been in the document for a while
... we talked about them from the very beginning

<dom> (it's not about being required or not, but being required for ASAP or for later)

burn: there have been things discussed more recently
... it seems weird to me to look
... -- the document has been stable for a month or two

stefanh: we have had some stuff in there for a long time
... but completely underspecified
... recording function

burn: not saying it was done
... just that something that's been in there
... that a good number of people are interested in
... we need to finish specifying
... i'm uncomfortable with yanking things out
... unless of course they're only a "heading"

derf: i want to second what stefanh said
... we have an impl in Chrome that throws Exceptions
... and Firefox calls the Error handler
... and there's assignment issues
... these are interop issues we should resolve
... i think we should prioritize those

<dom> +1 to derf

derf: i have no opinion on publishing

<fluffy> +1 derf on we need to fix all that

derf: but we need the underspecified things to be fully specified

Jim_Barnett: dom suggested splitting things out into modules
... it doesn't help us make a decision
... but it lets us release things independently

TravisLeithead: i'm not opposed to that idea

stefanh: i think it's a good idea

hta: seems to make sense
... that would mean that if we go ahead with a Recording interface
... we ask someone to put that together as a separate document

Jim_Barnett: i'd be happy to work on that document

dom: if by some miracle we make very quick progress on both modules
... nothing prevents us from moving them together on the REC track
... it doesn't delay things

<stefanh> @jim: thanks for offering!

dom: but by splitting out work
... we can guarantee core gets attention
... and the rest progresses if we have energy/support/resources
... i think it's a better way to make faster progress

Jim_Barnett: would the constraint language be a candidate for a separate module?

dom: there's been a lot of discussion about constraints and where they fit

<Zakim> dom, you wanted to ask about input from implementers schedule

fluffy: i have no objection to splitting the recording module
... but constraints as a hook is in core
... i think the specific constraint names can be defined elsewhere
... but the constraint api is core
... unless you have some other mechanism, it's hard to make it optional

dom: it's not the document that becomes the implementation

derf: HTML5 spec has a big thing about the <track> element in <video>
... we don't implement it
... even if you don't modularize things, we will

stefanh: seems we agreed to split out recording
... and we could integrate later depending on progress

Jim_Barnett: are we taking TravisLeithead's proposal as a basis?

TravisLeithead: i think my old proposal needs some work
... if we want to use that, i'd like to do a revised edition of that
... but if someone else would like to help
... i'm open to collaboration

Jim_Barnett: i'm glad to help

stefanh: ok, Jim_Barnett + TravisLeithead will start working on that document

Scenarios/Requirements Document

burn: one question came up is standardizing on "Recording" or "Media Capture" as a term
... there's a list of "sort of requirements"

<gmandyam> Will still image capture be part of the recording module spec?

burn: browser requires permissions
... do we want to take it out/or fold it into requirements

<TravisLeithead> The document we're discussing is here: http://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/scenarios.html

burn: in sec 5, we'll have a scenarios and apis, and that section could go away
... then do we want to go through stefanh 's comments

hta: Media Capture v. Recording
... we're best off if we use Recording as the act of putting media into a form that can be stored

<stefanh> +1 to hta

hta: whereas Media Capture, is anything that gets media into the computer

stefanh: i'd suggest adding a definitions section early

<TravisLeithead> Definitions: http://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/scenarios.html#concepts-and-definitions

Jim_Barnett: i'll do that
... we thought of a case of speech recognition
... is that recording or capture?

hta: i'd call it recording

TravisLeithead: i'd agree with that

<stefanh> me too!

Jim_Barnett: ok, great, i can use that as an example
... section 5?
... opinions?
... streams, reinitialization, stopping devices,
... how things might work, but not something for a final scenarios document

TravisLeithead: i agree
... a lot of those i put forth early on
... and we've clarified them
... some which we clearly understand, should be stated as such
... having things in this document is interesting from a historical perspective

Jim_Barnett: some of these we should figure out to remove
... and the rest, we can move into a bug tracker
... i can do this on the list
... the last one, section 2, has things that sound like requirements
... should those go out / get folded into requirements?

TravisLeithead: i'd like to see them folded into requirements
... or a discussion on the list to remove them

Jim_Barnett: i used them to create requirements

TravisLeithead: if there's duplicates, we can remove them

Jim_Barnett: i'll go through them
... and figure out which ones are orphans

hta: as soon as the requirements in the same section cover everything in the list
... i think the list should go away

Jim_Barnett: i'm going to go through it and check
... whatever's left, we can discuss on the ML
... do we have time to go through stefanh 's comments?

stefanh: i think we have time
... but maybe it's best to give people time to read

Jim_Barnett: that's what i was thinking
... give people time to read and pursue on the list
... a lot of scenarios refer to previewing remote media
... if this is a requirements document for local media
... why are they here?

stefanh: they're UCs for

hta: maybe the scenarios should be removed/modified

Jim_Barnett: the scenarios are good
... but they have references to remote media

TravisLeithead: when i was putting this together
... i believe i was instructed to have it apply to both WebRTC's document and getUserMedia
... so the scenarios intermingle local+remote
... if we want to scope this down, there's some surgery required for the scenarios

Jim_Barnett: or could we point to the webrtc doc?

hta: in an ideal world
... for each requirement here, the RTCWeb requirement number
... we should be able to say "this is already done over there"

Jim_Barnett: i can do that

hta: if they don't match up, we have a problem

TravisLeithead: is there another requirement for WebRTC
... not IETF RTCWeb

hta: we're using the RTCWeb requirements document as a joint requirement
... stefanh is the editor
... so claiming it's an IETF document
... it's hard to tell which is which

TravisLeithead: alright
... i think it's a good exercise to say which aren't handled by the gUM spec we're producing

Jim_Barnett: if there isn't a separate WebRTC requirement document
... where do i go?

hta: you reference the IETF document

stefanh: i'm not sure how lucky you'll be in that exercise
... i'm not sure you'll be able to find a 1:1 mapping

TravisLeithead: that was my concern
... but, go for it Jim_Barnett

Jim_Barnett: let me see what i can do
... maybe i refer to a couple of documents
... plus some scotch tape and glue over here

stefanh: let's see where it ends up
... i wouldn't object to keep the requirements there

Jim_Barnett: i think the idea is to add a pointer to the RTCWeb Requirements document

stefanh: if you need help, i can help

Jim_Barnett: ok, thanks

Constraints modification API proposal from Travis

<TravisLeithead> Revised Constraints API Proposal (start of thread): http://lists.w3.org/Archives/Public/public-media-capture/2012Aug/0066.html

hta: we had a very busy few hours on the ML
... somehow, TravisLeithead has now come with a revised proposal
... that seems to make my life much more complicated
... i'm not so happy
... one thing that has come up
... is whether tracks are mutable or immutable
... another question is how do we represent device manipulation
... "turn camera left", "turn on flash"
... one thing we _assumed_ in WebRTC land
... the WebRTC pattern seems to me
... lining up tracks/streams is a very heavyweight operation
... involving interchanges between parties
... to handle changes between parties
... that has to be modeled as a change to a track
... that has interesting consequences
... new signaling
... i'd like to see if we can get constraints that are changes to a track
... but i have a requirement to change the resolution of video while the video is playing
... that's a definite requirement (speaking as Google)

TravisLeithead: i think your thoughts are shared by the majority of those (everyone?) who replied to the proposal
... i'm not married to the concept of immutable tracks
... given the input you provided
... i'm more than happy to do a second version
... where constraints on tracks change the contents of the existing tracks
... i think it's been shown on the ML how swapping out tracks is too complicated
... i'm in agreement on that

hta: we have a point on the agenda about track controls
... richt_ do you see a relationship between that and this discussion?

TravisLeithead: i'm not sure what you're referring to
... can you be more specific

hta: i saw a proposal from you for a change-track
... i didn't quite understand where you were going with it
... if you had an idea of how we should approach constraints modification
... i'd appreciate it now

TravisLeithead: let me provide a high level proposal
... the proposal at its heart has a couple of fundamental changes
... the notable change is that it introduces a new list
... currently MediaStreams have a list that contains Tracks
... two lists, an Audio Tracks list and a Video Tracks list
... what i proposed is two additional lists
... present only on local media stream objects
... corresponding to the devices providing local media
... the reason i've done that is that the current lists
... can have things removed/inserted
... which makes it hard to identify which streams "I'm generating"
... and which are remote
... so these two additional lists are "my active local devices"
... if i call gUM, the selected source will be in that list
... on top of that, i've added an api to get current settings for a track
... this works with constraints
... and it gives a list of constraint ranges
... white balance -1...+1
... min-video x...y
... so you can extract those from one of these tracks
... and apply decision logic based on the end result
... if you want the lowest resolution from the camera
... if you're streaming a lot of video
... once you've figured this out
... you apply that to the track
... and that changes the track
... this is bolted onto the local track list
... so you can't use it on remote peer streams
... that's the high level proposal
... what we spoke of previously
... when you applied a constraint to a track
... it might cause a track to be stopped
... and a new track created
... the feedback i got
... was that was very heavy for UAs
... the general feedback was
... let's not dispose of the track and create a new track
... let's just modify the existing track
... i'm not opposed to that
... when you apply changes to a track
... perhaps an event is fired
... and that's it
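TravisLeithead's revised flow — read a track's current settings and ranges, apply decision logic, then change the same track rather than replacing it — could be modeled as below. All the names (`getRanges`, `applyConstraints`, `onchange`) are hypothetical placeholders for whatever the proposal ends up calling them.

```javascript
// Minimal model of "modify the existing track": the track is mutated
// in place and an event fires, rather than the track being stopped and
// a new one created. All member names here are hypothetical.
class LocalVideoTrack {
  constructor() {
    this.settings = { width: 1280, frameRate: 30 };           // current values
    this.ranges = { width: [320, 1920], frameRate: [5, 60] }; // device limits
    this.onchange = null;
  }
  getRanges() { return this.ranges; }   // for the app's decision logic
  applyConstraints(wanted) {
    for (const key of Object.keys(wanted)) {
      const [min, max] = this.ranges[key];
      // Best effort: clamp the request to what the device can do.
      this.settings[key] = Math.min(max, Math.max(min, wanted[key]));
    }
    if (this.onchange) this.onchange(this.settings); // same track, new settings
  }
}
```

Under this shape, `track.applyConstraints({ width: 640 })` drops the resolution mid-stream without tearing down and renegotiating a new track.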

richt_: one thing i wanted was to highlight differences
... between TravisLeithead 's proposal and our proposal
... one goal is to apply changes best effort
... after a resource is available
... our api isn't specific to a local video/camera stream track
... what we're using instead is a concept of a locked object
... the interface will be available
... but it won't work for locked objects
... TravisLeithead 's proposal adds an extra list of objects
... objects with different methods
... what we're proposing
... is to add the methods to all objects
... which reduces the object count

hta: one thing i got from the discussion
... trying to set a constraint has to be able to fail
... people will try to apply constraints to an object that won't work
... for various reasons

burn: i want to jump in on constraints
... as one of the people who sent the first email on it
... constraints were originally designed for selection
... not control over tracks
... the selection piece was what a developer would want to do when they want to get media
... and they want media for their purposes
... constraints are "things i'd like to have"
... or "these i must have, or there's no point"
... once you have that
... it seems like it'd make sense to look at a different api
... to figure out detailed capabilities
... or figure out how to control it
... i just wanted to clarify that
... conceptually that doesn't make sense to me

adambe: i've thought about constraints / control
... i agree with hta that a control api is better suited as a set of methods
... than perhaps a huge constraint object
... jesup has proposed wanting to change a Track down the line
... when you have random access to it
... Do you really need a method for selecting a track in an advanced way
... and another api to change things
... only going with Constraints makes it possible to select
... but going with Change lets you do more modifications at a later point
... having an API to change streams
... might make sense

burn: adambe, i hear you
... a problem i see is that
... if i request video
... and i'm given a Video device
... some device / some file
... my App is given a video Track from some device
... i don't know that i can control it
... i may discover as an application developer
... that it's useless to me

adambe: in some scenarios where you give hard constraints
... you may not have a gUM dialog
... you may get a failure right away

<Zakim> Josh_Soref, you wanted to ask if an item will go away when the user closes the device source

adambe: with a modification api, you need the gUM and select dialog

Jim_Barnett: on some devices there may be lots of sources

fluffy: the iPhone has 2 cameras (front, back)
... you want hires
... if you can't do that in selection
... you'll probably get the wrong one
... and then when you change things
... it probably fails because you already have one
... i think you need control and constraints
... you need constraints even if you add control

burn: when we came up w/ the initial proposal
... it was because there were significant concerns about fingerprinting
... constraints
... solves that

adambe: i see a risk with a constraint based, not best effort
... you can probe the user's device
... by providing hard constraints
... getting immediate error

<Zakim> Josh_Soref, you wanted to note that there are spying/profiling concerns in immediate fail w/o user interaction

adambe: the dialog never appears
... i think that's the motivation TravisLeithead had

TravisLeithead: the main fingerprinting concern is "drive-by" fingerprinting
... we did some prototyping
... we considered firing early v. later
... because if it fired early, you'd be able to figure out if there's a camera at all
... with pure constraints, you can probe
... with best effort, you can't probe as much

adambe: get-capabilities after you've been granted access to a track
... offer/peerconnection
... there's a lot of information in that offer
... you're already leaking that
... we had a ML discussion, hta , I and DDD
... TravisLeithead: how does gUM look with this?

TravisLeithead: perhaps i didn't make that very clear
... the workflow is that
... each time the application calls gUM they get a new instance of a local MediaStream
... that local MediaStream will have at most one Video track and at most one Audio track
... if the application needs more devices, they call gUM again
... the success callback will have a second entry in this local streams list
... and you could combine them at that point
... the video/audio devices lists live on each stream
... they're essentially a singleton
... when you request the list, you're given a singleton
... one singleton for video, and one singleton for audio
... it doesn't matter which local media stream you query, the list is the same list
... if i turn off the camera, that source will be removed from that list

adambe: how do you specify audio only / video only?
... is it gUM?

TravisLeithead: i had no intention of removing options from gUM
... i think at a minimum audio:true/video:true is important

<burn> I find it interesting that the people most focused on the *use* of gUM want constraints, while the people most focused on the *implementation* of gUM do not. Which is more appropriate for our target audience?

adambe: that was my interpretation

TravisLeithead: i'm opposed to mandatory constraints
... those seem to be where problems lie
... if you want a bunch of constraint hints for the UA/user
... to help select the most appropriate camera
... perhaps constraints can be constrained to just have optional
... perhaps it doesn't make sense to have constraints + control api
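In the "optional-only" direction TravisLeithead floats here, gUM would carry only hints, and the app would enforce its own hard floor after access is granted, via some inspection API. A sketch of that check, where `getSettings()` is a hypothetical per-track accessor:

```javascript
// Sketch of the optional-only direction: the app, not the UA, enforces
// its hard requirement after access is granted. getSettings() is a
// hypothetical per-track inspection API, not proposed spec text.
function meetsAppMinimum(track) {
  const s = track.getSettings();            // e.g. { width: 640, height: 360 }
  return s.width >= 640 && s.height >= 360; // the app's own hard floor
}
```

An app handed a track below its floor can then explain that to the user, rather than the UA rejecting before any prompt appears.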

<dom> (removing mandatory constraints — interesting idea)

hta: people definitely want to be able to change resolution

+1 to removing mandatory

burn: as someone who wants to build an app in HTML5 that behaves like a native app
... we have a trusted environment
... i don't want to show options
... i don't want to do any of that until i know i have something reasonable
... it seems like consumers want constraints
... but implementers do not [want hard constraints]

<dom> (I'll note that this distinction between "trusted environment" and "in-browser" is pretty fundamental in terms of API design; the Device APIs Working Group has suffered a very long time for trying to keep these two into a single API)

burn: i find that really interesting
... if we brought 20 app implementers
... they'd all want constraints

adambe: i don't think so
... if you don't know anything about the machine
... how would you create the constraint object
... it's much easier to get something, decide if it will work for you
... and then negotiate with the user
... we'll have very complex constraints object if you don't know anything about the machine you're dealing with
... you could dig down into details

hta: let's pay attn to the Queue

gmandyam: TravisLeithead, you may have answered this too
... one thing that concerned me
... the still image capture UC
... to set up the preview
... which would be different from a video conf app
... i know there may be problems w/ mandatory constraints
... what concerns me is modulating the constraints after the track is created
... upon creation of the track
... being able to apply constraints immediately
... could provide a better UE over a wider range of browsers

TravisLeithead: you're suggesting
... we have constraints before gUM
... we have a proposal to apply constraints to an existing track
... but you're proposing after gUM but before the track

gmandyam: i'm advocating for constraints on gUM
... i'm worried about video:true/audio:true as the only available constraints
... which wouldn't be desirable for certain applications
... i know hta talked about changing constraints later
... i'm worried about variability
... i want to make a 10fps hi-res preview mode
... i invoke that
... but i don't get that
... err i get that
... if i'm restricted to video:true/audio:true
... i can't modify the track until it's created

hta: i think we need to cut off soon

richt_: i want to talk about Why
... developers want constraints
... implementers don't want [hard] constraints
... what we're arguing over here
... it's vitriolic
... it's UE
... if you apply constraints up front
... it's before the prompt
... but what you do if you do it after the prompt
... the responsibility for rejecting users is left to the web app developer
... if you're up front rejecting someone
... that's an exclusive design philosophy
... that's not the web philosophy
... burn you mentioned trusted environments
... in those, you'll have no prompts
... there's no difference between early/later
... on the web, the responsibility lies with UAs
... it's about developers not wanting to take responsibility for not supporting their users

Josh_Soref: +1

hta: i think we'll conclude with that
... there's no consensus about changing gUM
... there seems to be consensus that changing tracks after creation is necessary

Recording API

TravisLeithead: i'd like to suggest we postpone this topic to a future Conference

gmandyam: is still image capture considered in the api?

TravisLeithead: i think richt_ 's proposal included it as a feature set of a video track
... i like that idea
... this is a swing thing
... it can go either way
... recording is about taking a stream and serializing to disk
... taking a picture is instantaneous operation
... it could be an operation

Jim_Barnett: it could be a separate module
... we could split it out

<burn> I disagree with you, richt, about there being no difference. I also disagree about not supporting users. Apps running on the iPhone are *very* different from ones running on the desktop. There is a great increase in the variety of devices, rather than a decrease, and apps will need to be customized for those devices, even if written in HTML. Developers *want* their apps to serve their users. Any other assumption in a largely free market makes no sense.

gmandyam: still image capture it could be locally stored, or in memory

richt_: the reason it's on video stream
... you can take a picture on anything
... a camera, or not a camera
... if it's a camera, you could use flash too
... but it's a function of having a video
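richt_'s framing — still capture as a feature of any video track, with flash available only when the source is actually a camera — might be modeled as below. `takePicture`, `grabFrame`, and `capabilities` are all illustrative names, not proposed API.

```javascript
// Hypothetical shape for still capture on a video track: any track can
// produce a frame; flash only works when the source reports it. All
// names here are illustrative, not proposed spec API.
function takePicture(videoTrack, opts) {
  opts = opts || {};
  if (opts.flash && !(videoTrack.capabilities && videoTrack.capabilities.flash)) {
    throw new Error("flash requested but the source has no flash");
  }
  return videoTrack.grabFrame(); // a still image from the track (placeholder)
}
```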

gmandyam: there are two proposals for still image capture
... i'd propose that richt_ and i work on it
... but i don't want it to be deferred

hta: it seems fairly separable
... so we should treat it as a separate module
... get it speced, agreed to, get implementations, sure

gmandyam: i'll talk to rich offline

Rich's proposal for advanced video track control

richt_: video stream, audio stream track
... tracks can be locked or not
... locked on a non local thing
... you're setting capabilities in lots of ways
... the proposal didn't go into detail, but you could see how it would work
... this is based on our user's requests
... focused on flash
... i think this is the simplest way to do it

hta: another approach is to go from media stream track to the underlying device

richt_: that'd be a smaller surface area
... which would be nice

hta: every time i see a class that has a four-piece name
... which kind of seems like a hierarchy
... it feels like there's something wrong there

richt_: i had the same hunch
... if we could query that object, and then transform it into a camera
... that'd be much nicer
... i think i've spent too much time in C++

Josh_Soref: that works in Mozilla w/ QueryInterface :)

richt_: the principle is sound, i think
... the reason it's simple
... is so we don't have to separate objects
... TravisLeithead 's thing introduces a risk of syncing objects

hta: i think i prefer simple device objects and simple track objects to complex track objects
... but the jury is still out

gmandyam: hta, you want the Still module to be separable

hta: i think so
... i can't tell for sure w/o a proposal

[ time check - time expired ]

richt_: returning tracks from gUM
... i guess there's a dependency either way

Close

stefanh: i'd like to summarize actions
... Jim_Barnett to look at scenarios/requirements
... and TravisLeithead was going to revise a recording proposal

<dom> ACTION: Travis to work with Jim on updated recording proposal [recorded in http://www.w3.org/2012/08/23-mediacap-minutes.html#action01]

<trackbot> Created ACTION-7 - Work with Jim on updated recording proposal [on Travis Leithead - due 2012-08-30].

TravisLeithead: i think people making proposals should get together
... it's good to have common ground

<dom> ACTION: James to look at scenarios/requirements [recorded in http://www.w3.org/2012/08/23-mediacap-minutes.html#action02]

<trackbot> Created ACTION-8 - Look at scenarios/requirements [on James Barnett - due 2012-08-30].

[ Adjourned ]

trackbot, end meeting

Summary of Action Items

[NEW] ACTION: James to look at scenarios/requirements [recorded in http://www.w3.org/2012/08/23-mediacap-minutes.html#action02]
[NEW] ACTION: Travis to work with Jim on updated recording proposal [recorded in http://www.w3.org/2012/08/23-mediacap-minutes.html#action01]
 
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.135 (CVS log)
$Date: 2009-03-02 03:52:20 $