Media Capture Task Force Teleconference -- 07 Jun 2012

<trackbot> Date: 07 June 2012

<stefanh> proposed agenda at http://lists.w3.org/Archives/Public/public-media-capture/2012Jun/0020.html

Approve minutes

RESOLUTION: May 9 minutes approved

Requirements

hta: the big item for today is requirements
... is travis here?

Yang: I've posted something on the list

<dom> Help with capturing requirements from scenarios, from Harald
Yang: the requirements need more work

hta: we had an original set of requirements from the WebRTC working group
... we had a task to extract requirements that would apply to this TF

<scribe> ScribeNick: Josh_Soref

hta: We have a set of UCs that we agreed upon
... and we're in the process of generating a set of requirements from those UCs
... we kind of implicitly understand what those requirements are
... but the process of actually writing the actual text of the requirements
... has taken much more time than expected

<Yang> now we get the requirement one by one UC, right?

hta: so what we need to discuss on this call
... is how we need to specify requirements
... and what the requirements are

Yang: this conference call will specify detailed requirements

hta: we've finished the work of generating scenarios

Yang: ok, i see
... i sent some requirements to the list

stefanh: i think there were at least 3 people who sent proposed requirements to the list
... i guess we should discuss if that's the right level of requirements
... with the right level of detail in them

hta: i think that once we have gathered requirements from all the scenarios
... we need to put them together and de-duplicate them
... and link them back to the scenarios
... and see what are the higher level abstractions from the requirements
... and what parts of the system aren't in our scope

Yang: i agree with that

<dom> [is Travis still offering to act as an editor for requirements?]

Yang: can we go through the scenario 2.5?

<dom> 2.5 Conference call product debate (multiple conversations and capture review)

hta: we can look at that yes

Yang: i posted a message to the list

<dom> Requirements extracted from scenario "conference call"

Yang: requirement to directly access the video of a user
... without opening a new window

hta: we need to figure out what we're doing in this group
... and what is expected to be done elsewhere in the ecosystem
... you mentioned that a user could request recording from the secretary
... the process of requesting is outside the scope of this group
... but the process of recording a stream and sharing it with someone
... may be in scope for our WG

Yang: ok

hta: we should mention which things we expect to be handled
... by others
... [specifically which others]
... it's good to do this
... because sometimes people say "oh, we don't have plans to do that"

<hta> Something bad happened to my microphone. I'm back now.

burn: we will never have these problems with WebRTC, right?

[ laughter ]

hta: as I was saying
... a lot of these requirements revolve around Storing and Retrieving
... Audio and Video
... which probably means saving to File or equivalent

Yang: saving media to file or equivalent
... do we need to get permission from the source of the media?
... [ if it's being streamed from them to the side that wants to save it ]

jesup: I don't think that's something we should specify here
... certain countries/localities have restrictions about that
... but it's way too complex to insert in the protocol

anant: even if we specify that
... there's no way to enforce it
... we should leave it to the web app

[ General Agreement ]

hta: when you're talking about sending over the network
... it's also in the realm of the RTCWeb group
... not this WG

stefanh: how do we do this now?
... we can't go through the requirements in this meeting
... it will take too much time
... we should do this offline and ask Travis if he can integrate them in the document
... and if he doesn't have time, find someone else to do it

hta: we might want to ask if anyone knows they have time/could do it

Jim: i can
... but if travis is going to do it, i should communicate with him

hta: to us as chairs, it's not so important who does it, so much that it gets done

stefanh: i assume we have to check with travis
... and if he has limited time, we come back to Jim

<hta> ACTION: hta to ask travis if he can integrate collated requirements into his document, otherwise to Jim [recorded in http://www.w3.org/2012/06/07-mediacap-minutes.html#action01]

<trackbot> Created ACTION-4 - Ask travis if he can integrate collated requirements into his document, otherwise to Jim [on Harald Alvestrand - due 2012-06-14].

hta: we should also figure out if there are some requirements
... that would be too onerous to do in version one

stefanh: I added requirements about being able to pan audio in 3d
... but maybe that isn't required in the first phase
... as far as we know now, that would need something from the Audio WG

jesup: anything involving something like that is something that comes after Capture
... and I don't think it needs to be specified in this TF

Yang: for 3d, do you mean visualizing sound in 3D?

<dom> [sound spatialisation ]

Yang: i also agree realized audio would be related to the Audio WG

hta: we had one volunteer to work with this
... i suggest we move on

Media Stream

Jim: I thought about the idea of Media Streams being assigned to certain media elements

<stefanh> Jim's proposal

Jim: there are certain restrictions
... and I thought we could produce a table
... relating to their values
... there are several questions that came up
... Media Elements are referenced by URL
... and a question that came up related to direct assignment
... in the current version, you must create a URI and pass in the URI
... another thing, when you create a URL, you can Revoke it later
... I presume that Revoking the URL
... doesn't change the <media source> field
... because changing the source element triggers a long process
... we need to figure out what happens in that case
... there was an original proposal from Opera that was linked
... for the Seekable attribute, there
... are problems with non-seekable streams
... and I had things return 0 to indicate that the stream couldn't be seeked
... another problem is that Media Streams don't have text tracks
... but they're optional

hta: I suspect they might have them in a year or two

Jim: if you had a real time speech recognition
... system, you could produce text

Yang: if a UA doesn't have a certain feature, then you don't use it
... seek time/seek rate

Jim: are you agreeing that Seekable start, end, time should be 0?

hta: what is the definition of current time?

Jim: it's supposed to be the current position in the stream

hta: if that has to increment, then seekable start+end should return current time
... if not, then 0

Jim: i think current time increases in real time linearly
... of course, you can't seek forward in this
... but, could you want to buffer?

<dom> HTML5 currentTime attribute on MediaElement

derf: i don't think we want that at all
... the thing on the media element
... should be what is playing in real time
... right now

stefanh: i agree with that

Jim: i agree, that would be separate

hta: i'd suggest we say explicitly that there is no buffer

Jim: i think that's one thing i have to add to this
... when you pause the stream, and then resume
... it doesn't buffer, and i need to add a statement on that

jesup: i agree
... on seekable, i think you might be less confused
... if you have start + end always return current time

Jim: you're saying seekable length should return 0

jesup: that's less likely to confuse implementations
... that use it to generate UI elements
... either that or you return an error
... you're talking about things that are effectively buggy in the first place
... the argument is equally valid

<Zakim> Josh_Soref, you wanted to say that throwing is more likely to break UI elements
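
The seekable options discussed above can be sketched roughly as follows. This is a minimal model, assuming the "start and end both return current time" option and the "no buffer" statement; `LiveElementModel` and `TimeRange` are hypothetical names, not HTML5 or getUserMedia interfaces.

```typescript
// Hypothetical model of a media element playing a live MediaStream.
// LiveElementModel and TimeRange are illustrative names, not spec interfaces.
interface TimeRange {
  start: number;
  end: number;
}

class LiveElementModel {
  private elapsed = 0; // seconds of real-time playback so far

  // Advance the clock; a live stream only moves forward in real time.
  tick(seconds: number): void {
    this.elapsed += seconds;
  }

  get currentTime(): number {
    return this.elapsed;
  }

  // Option from the call: start and end both equal currentTime, so the
  // seekable window has zero length and nothing is thrown.
  get seekable(): TimeRange {
    return { start: this.elapsed, end: this.elapsed };
  }

  // No buffering is assumed: a seek request leaves the position unchanged.
  seek(_target: number): number {
    return this.currentTime;
  }
}

const el = new LiveElementModel();
el.tick(2.5);
const win = el.seekable;
console.log("seekable window width:", win.end - win.start);
```

A UI that sizes a scrub bar from the window width would see an empty range rather than an exception, which is the concern raised about throwing.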

hta: this table is a great table to have, the question now is
... where should we insert it in the spec?
... is it a new section?

stefanh: I think it's a new section

Jim: is it an appendix or something?

hta: I think it deserves a section
... a section that talks about interaction between MediaStream and <media>

Jim: ok

<dom> (the partial interface url {} could go under there)

hta: i suggest we charge one of our editors to work with Jim to insert this into the spec
... do we have a volunteer editor?

<hta> ACTION: burn to work with Jim to integrate Jim's table into the spec [recorded in http://www.w3.org/2012/06/07-mediacap-minutes.html#action02]

<trackbot> Created ACTION-5 - Work with Jim to integrate Jim's table into the spec [on Daniel Burnett - due 2012-06-14].

stefanh: Jim and burn will do this

Resource reservation

stefanh: anant, you made a new proposal and integrated it into the specification

anant: there are two points i added to the document
... they're non-normative
... first, we suggested

<stefanh> http://dev.w3.org/2011/webrtc/editor/getusermedia.html#implementation-suggestions

anant: when a resource has been used to provide to the given page
... that it should be marked as busy
... and subsequent requests within the page or elsewhere
... to assign the resource to an element
... should result in a busy
... and i followed up w/ a suggestion that the UA indicate to the User that
... the resource is busy and allow the user to reassign the resource to the new requester
... the second suggestion is for non hardware resources
... such as using a file picker to assign a stream
... we had a discussion at the last telco
... a media stream can have multiple tracks
... which thus have multiple hardware resources
... an app could prompt repeatedly for getUserMedia
... and then merge them
... after I sent that out
... I think we should define something
... around letting web pages determine how many audio/video sources they can have
... i wouldn't be comfortable revealing resolutions
... hta had a proposal that i liked
... specifying Max XXX to it
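
The "busy" suggestion above could look something like this on the UA side. This is only a sketch under assumptions: `DeviceRegistry`, `request` and `reassign` are hypothetical names, and the real behaviour is left to implementations by the non-normative text.

```typescript
// Hypothetical UA-side bookkeeping; none of these names come from getUserMedia.
type Grant = { ok: true; device: string } | { ok: false; reason: "busy" };

class DeviceRegistry {
  private owners = new Map<string, string>(); // device id -> current requester

  // Assign the device if it is free; otherwise report that it is busy.
  request(device: string, requester: string): Grant {
    const owner = this.owners.get(device);
    if (owner !== undefined && owner !== requester) {
      return { ok: false, reason: "busy" };
    }
    this.owners.set(device, requester);
    return { ok: true, device };
  }

  // The user explicitly moves a busy device to the new requester.
  reassign(device: string, requester: string): Grant {
    this.owners.set(device, requester);
    return { ok: true, device };
  }

  ownerOf(device: string): string | undefined {
    return this.owners.get(device);
  }
}

const registry = new DeviceRegistry();
console.log(registry.request("camera0", "videoA")); // first requester gets it
console.log(registry.request("camera0", "videoB")); // second requester sees busy
console.log(registry.reassign("camera0", "videoB")); // user reassigns by hand
```

Only the user-driven `reassign` path can take a device away from its current holder in this model; a page cannot do so by itself.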

<dom> Grabbing exactly one camera, from Harald

adambe: how is this compatible with the rest of the constraint structure?
... if you ask for 2 cameras
... and have a constraint
... and one camera is high res, and one isn't

anant: I think that the constraints would apply to both
... if you as a web developer want to accept different constraints
... you should make two calls
... say a web developer has 2 <video> elements in the page

<dom> [that sounds like something we need to think more about, preferably with code examples]

anant: it's more intuitive for the user to assign in place
... the user clicks on a <video> tag, gets a door hanger
... picks a source
... and repeats for the other
... i think it's more intuitive than a window level door hanger

adambe: I agree
... if you have a front and a back camera
... you could have both streams

burn: that doesn't mean you need to request both at the same time

adambe: the scenario that anant described
... i'd like to have a way to select both cameras at the same time

burn: I have the same concern
... the moment we say that you can return multiple video tracks
... there will be pressure to expand constraints
... for individual things

hta: getUserMedia is an Async call
... what if you call getUserMedia twice before getting to a stable state?

adambe: I guess you will queue tasks to present UI

anant: what do you mean stable state?

hta: in JS, we have run-to-completion
... if we want to achieve two requests without bothering the user
... the UI might merge the two requests

adambe: and then you'd have two constraint sets

burn: hta is saying that as a UA optimization
... the UA could merge the User facing request

adambe: as a developer you kind of need to know that this could happen
... we could definitely do that

jesup: there could be some api issues, as to what completion would be called

anant: i think both would be called
... I think run-to-completion is pretty well understood
... by JS developers
... I think it's understood that the UA will handle both
... before it gets back to you

jesup: it definitely resolves the concern about complexity in constraints

derf: I like this a lot too

jesup: it would also allow the user to supply one and not the other

anant: and one success would be called and one failure

Zakim: who is speaking?

adambe: would you have two streams?

jesup: I think you'd have two streams and you'd have to merge them

anant: it's always going to be two streams
... and it's very easy to merge them

adambe: the difference is just bugging the user once instead of twice

Yang: [garbled]

adambe: if you call getUserMedia() twice, they would be presented visually to the user as one call

<Yang> i support 1 call of getusermedia can return 2 stream at the same time

<scribe> ACTION: anant write text saying that it's possible to call getUserMedia() n times and the UA MAY coalesce those requests into a single UI [recorded in http://www.w3.org/2012/06/07-mediacap-minutes.html#action03]

<trackbot> Created ACTION-6 - Write text saying that it's possible to call getUserMedia() n times and the UA MAY coalesce those requests into a single UI [on Anant Narayanan - due 2012-06-14].
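
The coalescing idea in this action can be sketched as below, assuming requests made in the same run-to-completion turn are batched into one prompt and each caller still gets its own success or failure. `CoalescingPrompter`, `requestMedia` and `promptUser` are hypothetical stand-ins, not the getUserMedia signature.

```typescript
// Hypothetical sketch of a UA merging several requests into one prompt.
type MediaRequest = { kind: "audio" | "video" };
type Decision = { granted: boolean };
type Pending = {
  request: MediaRequest;
  resolve: (stream: string) => void;
  reject: (err: Error) => void;
};

class CoalescingPrompter {
  private pending: Pending[] = [];
  private promptUser: (batch: MediaRequest[]) => Decision[];
  promptCount = 0; // how many times a UI was shown

  // promptUser shows ONE UI for a batch and returns one decision per request.
  constructor(promptUser: (batch: MediaRequest[]) => Decision[]) {
    this.promptUser = promptUser;
  }

  requestMedia(request: MediaRequest): Promise<string> {
    return new Promise<string>((resolve, reject) => {
      this.pending.push({ request, resolve, reject });
      // Schedule a single flush once the current script has run to completion.
      if (this.pending.length === 1) {
        Promise.resolve().then(() => this.flush());
      }
    });
  }

  private flush(): void {
    const batch = this.pending;
    this.pending = [];
    this.promptCount += 1;
    const decisions = this.promptUser(batch.map((p) => p.request));
    batch.forEach((p, i) => {
      const decision = decisions[i];
      if (decision !== undefined && decision.granted) {
        p.resolve(`${p.request.kind}-stream`);
      } else {
        p.reject(new Error("PERMISSION_DENIED"));
      }
    });
  }
}

// Stand-in for the user's choice: supply video, decline audio.
const prompter = new CoalescingPrompter((batch) =>
  batch.map((r) => ({ granted: r.kind === "video" }))
);
const describe = (p: Promise<string>): Promise<string> =>
  p.then((s) => `success: ${s}`, (e: Error) => `failure: ${e.message}`);
Promise.all([
  describe(prompter.requestMedia({ kind: "video" })),
  describe(prompter.requestMedia({ kind: "audio" })),
]).then((outcomes) => {
  console.log(outcomes, "prompts shown:", prompter.promptCount);
});
```

Each call keeps its own callbacks, so the user can supply one source and not the other, and the page still ends up with two separate streams to merge if it wants one.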

adambe: the UI element that the UA presents can coalesce the things faster than the User interacts with the UI element

dom: you're talking about one next to another
... there's a possibility that the dialog box has to be adjusted when the second request comes in later

Yang: I have to go, bye

<Yang> bye

adambe: the concern was that you'd get one dialog
... and you select one stream
... and then it would flicker

<dom> [in any case, that seems something we should leave to UA to figure out :) ]

burn: i have no concerns

<stefanh> +1 to Dom

anant: we also need to define the subset of constraints you're allowed to pass to getUserMedia
... for untrusted web pages

<dom> [is a stereo audio stream a single MediaStreamTrack?]

hta: now you're talking about capabilities, not constraints

burn: that matches something i proposed a while ago
... but we can delay that discussion to later

hta: i think we have other items in our agenda

Constraints

jesup: I had a proposal
... we talked about changes for getUserMedia
... we talked about reusing the constraint structure to pass in updated constraints
... to change the stream you already have
... without generating a new permission request
... and i wrote an email on this
... i won't read the whole email here
... there are bits relating to events bubbling up
... to PeerConnection or <media> or ....
... i want to see what people think about that?

burn: jesup, i'm skimming your email
... i don't understand the issue in general

<dom> MediaStreams and state changes, from Randell Jesup

burn: why do you need to change the parameters of a stream instead of requesting a new one
... i guess that's avoiding the permission check

jesup: there's a problem where a Stream is a remix
... and you might have lost track of the original request
... the original request might not be easily reachable
... Streams don't always originate in getUserMedia
... they can be PeerConnection
... so, we either need to make them consistent
... or ...
... as a consumer, i can't get the original parameters
... because i don't have the original request

burn: you're right, MediaStreams can be derived from anywhere
... constraints really apply to local media streams
... because that's what's returned from getUserMedia
... what are the UCs for changing the parameters of a derived MediaStream?

jesup: changing the trade off between resolution and rate

<dom> (is this a use case or scenario we have captured already?)

hta: so, we have two needs
... to be able to query the present constraints/capabilities of an object
... the other... it's probably more reasonable to set constraints on a track rather than a stream
... because a track is more specific
... and also, changing capabilities has to be able to fail

jesup: yes

burn: i'd like to understand one other thing
... say you use getUserMedia
... and you get a Video Track
... which is copied into a Media Stream
... and then into another
... say you want to change parameters on that final Media Stream's Track
... would you want it to apply to ancestors, descendants and other distant relatives?

jesup: I think that is what would have to happen
... i think it has to bubble up
... i think it's a possibility
... bubbling up to the provider, whoever called getUserMedia
... that bit could change the original constraints
... it could modify it
... separating "i want this change"
... from "how do i make this change"
... how you do the change on a local connection
... might be different from a PeerConnection
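
The bubbling idea above, separating "i want this change" from "how do i make this change", might be modelled as follows. This is a sketch under assumptions: `TrackNode`, `derive` and `requestChange` are hypothetical names, and the size limit is invented for illustration.

```typescript
// Hypothetical model of derived tracks passing a change request to the producer.
type Size = { width: number; height: number };
type ChangeHandler = (wanted: Size) => boolean; // true if the change was applied

class TrackNode {
  readonly label: string;
  private parent: TrackNode | null;
  private handler: ChangeHandler | null;

  constructor(
    label: string,
    parent: TrackNode | null = null,
    handler: ChangeHandler | null = null
  ) {
    this.label = label;
    this.parent = parent;
    this.handler = handler;
  }

  // Copy this track, for example when it is added to another stream.
  derive(label: string): TrackNode {
    return new TrackNode(label, this, null);
  }

  // "I want this change": walk up until some producer knows how to apply it.
  requestChange(wanted: Size): boolean {
    if (this.handler !== null) return this.handler(wanted);
    if (this.parent !== null) return this.parent.requestChange(wanted);
    return false; // no producer reachable, so the request fails
  }
}

const cameraLimit: Size = { width: 1280, height: 720 }; // assumed device maximum
let cameraSize: Size = { width: 640, height: 480 };

// "How do I make this change" lives only in the producer's handler.
const cameraTrack = new TrackNode("camera", null, (wanted) => {
  if (wanted.width > cameraLimit.width || wanted.height > cameraLimit.height) {
    return false;
  }
  cameraSize = wanted;
  return true;
});

const distantCopy = cameraTrack.derive("copy").derive("copy-of-copy");
console.log(distantCopy.requestChange({ width: 352, height: 288 }), cameraSize);
console.log(distantCopy.requestChange({ width: 1920, height: 1080 }), cameraSize);
```

A producer behind a PeerConnection could install a different handler without the consumer's request changing, and the boolean result reflects that a change has to be able to fail.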

burn: why is it that your constraints are more precise
... for the video element
... if you know the starting is 100x100 and the max is 1000x700

jesup: your constraint would probably be unconstrained
... but once it's connected to a playback element
... it might say i'm 352x388
... which it could do at a higher framerate than 1920x1800
... until you know what's using it
... and it could later rescale to 1000x800

stefanh: a common case is you don't specify a size when you do getUserMedia
... so there'd be a default stream

jesup: I agree
... and I think it might be the common case
... all of this is moving to the
... but video streams is something larger and more generic

<Zakim> Josh_Soref, you wanted to say screen orientation changes, or connecting an external projector

stefanh: if you allow no constraints
... the browser can select anything
... if you set minimum/maximum, the browser has to operate within them

burn: that was the goal of constraints
... for the client to say "certain things are unacceptable to me"
... "aside from that, do something intelligent for me"
... one thing i heard pretty early on

<dom> "browser, make it so"

burn: was people didn't want "exactly this configuration"
... because it's hard to define that
... constraints is somewhere in between
... you can be as precise as you want
... you can let the browser pick something
... but it doesn't counter jesup's UC
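
The min/max behaviour described above can be sketched as a selection over the modes a device supports. This is an illustration only: `Bounds`, `VideoConstraints` and `selectMode` are hypothetical, and the field names are not the draft's constraint syntax.

```typescript
// Hypothetical constraint shape, assumed for illustration.
type Mode = { width: number; height: number; frameRate: number };
type Bounds = { min?: number; max?: number };
type VideoConstraints = { width?: Bounds; height?: Bounds; frameRate?: Bounds };

// True when value lies inside the optional bounds.
function within(value: number, bounds?: Bounds): boolean {
  if (bounds === undefined) return true;
  if (bounds.min !== undefined && value < bounds.min) return false;
  if (bounds.max !== undefined && value > bounds.max) return false;
  return true;
}

// Return the first acceptable mode in the browser's own preference order,
// or null when every mode violates some constraint.
function selectMode(modes: Mode[], c: VideoConstraints = {}): Mode | null {
  for (const m of modes) {
    if (
      within(m.width, c.width) &&
      within(m.height, c.height) &&
      within(m.frameRate, c.frameRate)
    ) {
      return m;
    }
  }
  return null;
}

// Assumed device modes, listed in the browser's preferred order.
const deviceModes: Mode[] = [
  { width: 1920, height: 1080, frameRate: 30 },
  { width: 1280, height: 720, frameRate: 30 },
  { width: 640, height: 480, frameRate: 60 },
];

console.log(selectMode(deviceModes)); // no constraints: browser picks freely
console.log(selectMode(deviceModes, { width: { max: 1000 } })); // bounded pick
console.log(selectMode(deviceModes, { width: { min: 4000 } })); // unsatisfiable
```

The application only states what is unacceptable; which of the remaining modes is chosen stays with the browser.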

jesup: Constraints might be the mechanism for implementing the change request
... just talking about the infrastructure by which a request could pass from a consumer to a producer
... this could be for any information that a consumer would pass

stefanh: does the application need to be involved
... or can the browser handle it?

jesup: i think the browser could handle it by default
... unless the application takes control
... it would allow the application to assert itself
... for example the split screen case

Feedback to WebRTC/RTCWeb

jesup: Q: Is resolving questions involving Media Streams / Derived streams something that should be handled Here, There, or both?

hta: the concept of derived streams belongs to us

jesup: then, I guess that stays here

stefanh: what would apply to WebRTC/RTCWeb is... if the producer is on the other side of PeerConnection
... how do you pass along the request

jesup: yep

hta: anything else we need to talk to them(us) about?

<jesup> :-)

Open Items

<stefanh> http://www.w3.org/wiki/Media_Capture#Open_Items

stefanh: Image Capture APIs
... I don't think we ever concluded that discussion
... <iframe> behavior
... direct assignment, Jim mentioned that earlier
... we've identified a need to create Dummy tracks
... other sources, in the getUserMedia to select a file

anant: for a lot of these things, there's no conclusion
... are we still free to introduce them to the draft?
... for parts where there's consensus, we should assign people to write them

hta: the possibility to select a non mic/camera
... we need a proposal
... how does that fit in with the permissions model
... other sources where we need permissions
... or don't need permissions
... we need a proposal

<dom> [anant's text on suggestions for implementations already suggest to make it possible to select a local file as a source for video]

hta: for video sourced from a file, we need the normal permission

dom: do you need getUserMedia for this?

<Zakim> Josh_Soref, you wanted to say that if the user selects a file in response to getUserMedia, you don't need further permissions

dom: anant's text for getUserMedia should let user pick files from the local system
... maybe the text should mention screen sharing
... all of that seems to be something we can leave to the implementation at this stage

anant: i think we should let MediaStream() take a <canvas> or <video>
... that should be relatively simple to add to the spec
... and relatively simple to implement as well

Yang: how does getUserMedia indicate to the user that they can select a local file but not a camera

anant: i don't think we should let a web page say "you can only give me a file" or "you can only give me a camera"
... that should be the user's choice

dom: if you really want a video, you could use the file api
... <input type=file accept=video/*>
... the UC of getting access to a video file and making it a MediaStream
... can be done with other files

Yang: getUserMedia could have constraints that don't match the file

anant: a drawback of the Geolocation API is that we don't allow users to lie

anant: you should be able to try out an app without giving access to your real camera
... just giving it a file instead
... to a developer, there's no difference
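
The point that a file and a camera look the same to the developer can be illustrated with a shared interface. This is a sketch under assumptions: `FrameSource`, `CameraSource` and `FileSource` are hypothetical names and frames are modelled as plain strings.

```typescript
// Hypothetical interface; the application only ever sees FrameSource.
interface FrameSource {
  nextFrame(): string | null; // null when no more frames are available
}

// Stand-in for a real camera: produces frames for as long as it is asked.
class CameraSource implements FrameSource {
  private count = 0;
  nextFrame(): string | null {
    this.count += 1;
    return `camera-frame-${this.count}`;
  }
}

// Stand-in for a user-selected file: replays a fixed list of frames.
class FileSource implements FrameSource {
  private frames: string[];
  constructor(frames: string[]) {
    this.frames = [...frames];
  }
  nextFrame(): string | null {
    const frame = this.frames.shift();
    return frame === undefined ? null : frame;
  }
}

// Application code: identical for both kinds of source.
function grab(source: FrameSource, count: number): string[] {
  const out: string[] = [];
  for (let i = 0; i < count; i++) {
    const frame = source.nextFrame();
    if (frame === null) break;
    out.push(frame);
  }
  return out;
}

console.log(grab(new CameraSource(), 2));
console.log(grab(new FileSource(["file-frame-1", "file-frame-2", "file-frame-3"]), 2));
```

Because `grab` never inspects which class it received, the choice between a real camera and a file stays entirely with the user.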

stefanh: we're running out of time
... being able to display audio level in audio tracks
... which is in the UC document

<Yang> getUserMedia({video:file}), like that?

stefanh: to know if the right mic is connected

Josh_Soref: you could just pass that to the Audio WG

stefanh: editors ...

dom: they have converged on the Web Audio proposal

hta: they converged on something?
... there's no longer controversy?

stefanh: excellent
... we'd need a way to construct a MediaStream from video and canvas
... it's needed from day one

<anant> thanks Josh_Soref

stefanh: thanks everyone for coming
... and thanks Josh_Soref for scribing
