Re: Scenarios doc updated from Efren CD on 2012-01-17 (public-media-capture@w3.org from January 2012)

From: Efren CD <efrencd@gmail.com>
Date: Tue, 17 Jan 2012 17:58:14 +0000
To: public-media-capture@w3.org
Message-ID: <CAJ+mEdStQuxaa23hXyk6PUCtnYKn2Q8njcPx9q=fXKgZRbfZiw@mail.gmail.com>
Hello all. I'm new to the group, but I'd like to add a few words on this:

I think that instead of using the word "Webcam" in the MediaStream Capture
Scenarios, it would be better to use "Video Capture Device", as it is more
generic. Some users may want to use other kinds of video capture devices,
such as video capture cards (professional or consumer), to feed video into
the browser. In this context, it is essential that the browser gives the
user an option to change the video capture device settings (as pointed out
in the "Find the ball" assignment scenario). That way the user could, for
example, choose the video capture mode (NTSC, PAL, HD modes), hardware
input audio/video levels, etc. Flash and Silverlight can capture media from
video capture devices, but they do not show any option to change the device
settings, so whatever the default video capture mode is, it stays in effect
and cannot be changed from the application. This is ugly.
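
As a rough illustration only, the TypeScript sketch below shows how an
application might choose a capture mode if the browser exposed the modes a
device supports. Nothing here comes from the Scenarios document or any
spec: `CaptureMode`, `pickCaptureMode` and the example mode list are
hypothetical names invented for this example.

```typescript
// Hypothetical description of one capture mode a device could report.
// These names are illustrative assumptions, not part of any specification.
interface CaptureMode {
  label: string;     // e.g. "NTSC", "PAL", "HD 720p"
  width: number;     // frame width in pixels
  height: number;    // frame height in pixels
  frameRate: number; // frames per second
}

// Return the mode whose label matches the user's choice. If the requested
// mode is not available, fall back to the first entry (treated here as the
// device default); an empty list yields undefined.
function pickCaptureMode(
  supported: CaptureMode[],
  requestedLabel: string
): CaptureMode | undefined {
  for (const mode of supported) {
    if (mode.label === requestedLabel) return mode;
  }
  return supported[0];
}

// Example mode list for an imaginary capture card.
const cardModes: CaptureMode[] = [
  { label: "NTSC", width: 720, height: 480, frameRate: 29.97 },
  { label: "PAL", width: 720, height: 576, frameRate: 25 },
  { label: "HD 720p", width: 1280, height: 720, frameRate: 50 },
];

const chosen = pickCaptureMode(cardModes, "PAL");
console.log(chosen ? chosen.label : "no modes available");
```

The point of the sketch is only that the choice is driven by the user's
request rather than by whatever default the device happens to report.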

Another interesting scenario is the one at the Coliseum, with the following
requirement: "Local video previews from two separate webcams
simultaneously". And I ask: why only two? I think the number of capture
devices that can be used at once should not be restricted.

I can imagine a new scenario in which someone codes a Web-based video
surveillance system able to get input from several (remote and local) video
capture devices, display them onscreen, start/stop recording based on
motion detection, send remote alerts, etc.
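
The "start/stop recording based on motion detection" part could be
sketched roughly as below, assuming the application can read grayscale
frames from each capture device. Simple frame differencing is used here as
one possible technique; the names `changedFraction`, `RecorderState` and
`nextState`, and all threshold values, are hypothetical.

```typescript
// Fraction of pixels whose grayscale value (0-255) changed by more than
// `pixelDelta` between two frames of the same size.
function changedFraction(
  prev: Uint8Array,
  curr: Uint8Array,
  pixelDelta: number
): number {
  if (prev.length !== curr.length || prev.length === 0) {
    throw new Error("frames must be non-empty and the same size");
  }
  let changed = 0;
  for (let i = 0; i < prev.length; i++) {
    if (Math.abs(prev[i] - curr[i]) > pixelDelta) changed++;
  }
  return changed / prev.length;
}

// Recording state kept per capture device (hypothetical shape).
interface RecorderState {
  recording: boolean;
  calmFrames: number; // consecutive frames without motion while recording
}

// Start recording when motion reaches `startAt`; stop again after
// `quietFrames` consecutive frames below that level.
function nextState(
  state: RecorderState,
  motion: number,
  startAt: number,
  quietFrames: number
): RecorderState {
  if (motion >= startAt) return { recording: true, calmFrames: 0 };
  if (!state.recording) return { recording: false, calmFrames: 0 };
  const calmFrames = state.calmFrames + 1;
  return calmFrames >= quietFrames
    ? { recording: false, calmFrames: 0 }
    : { recording: true, calmFrames };
}
```

One `RecorderState` would be kept per device, so nothing in the logic
depends on how many capture devices are attached.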


Thank you guys. This is a very interesting group.

Best regards. Efren.

On 17 January 2012 17:56, Efren CD <efrencd@gmail.com> wrote:

> I'm sorry, Stefan, I sent my last email directly to you. My intention was
> to post it to the Media Capture mailing list. My mistake.
>
> I'm going to repost it to the group.
>
>
>
> On 17 January 2012 17:55, Efren CD <efrencd@gmail.com> wrote:
>
>> On 17 January 2012 08:15, Stefan Hakansson LK <
>> stefan.lk.hakansson@ericsson.com> wrote:
>>
>>> On 01/16/2012 09:08 PM, Travis Leithead wrote:
>>>
>>>> Great feedback.
>>>>
>>>> A few thoughts below. I'll get to work on incorporating this feedback
>>>> today or tomorrow.
>>>>
>>>> -Travis
>>>>
>>>>> -----Original Message----- From: Stefan Hakansson LK
>>>>> [mailto:stefan.lk.hakansson@ericsson.com]
>>>>>
>>>>> On scenarios:
>>>>> =============
>>>>> * It is not stated what should happen when the browser tab that has
>>>>> been allowed to capture is not in focus. I bring this up since one of
>>>>> the Speech JavaScript API Specifications
>>>>> (http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/att-1696/speechapi.html#security)
>>>>> proposed that capture should stop in such cases, while that kind
>>>>> of behavior is not at all what you would want in e.g. a
>>>>> conferencing scenario (where you would often like to use another
>>>>> browser tab to check out things while being in the conference).
>>>>>
>>>>
>>>> This indeed is an interesting scenario. I would love to hear others'
>>>> thoughts on this. We'll walk a fine line between user privacy (e.g.,
>>>> seems like a bad idea to just leave the user's camera on when they
>>>> switch away from an active browser tab), and usability (e.g., in
>>>> conferencing scenarios). Perhaps the use of a PeerConnection can
>>>> trigger some state change in implementations such that they persist
>>>> after switching tabs?
>>>>
>>>
>>> I think Randell provided good feedback here - I have little to add.
>>>
>>>
>>>
>>>>
>>>>
>>>>> * There is no scenario that uses the screen of the device as input.
>>>>> To give some background, screen sharing is a use case for webrtc
>>>>> (use case 4.7.2 in
>>>>> http://datatracker.ietf.org/doc/draft-ietf-rtcweb-use-cases-and-requirements/?include_text=1).
>>>>> Some experimentation has been carried out, using getUserMedia, and it
>>>>> was found to be a viable way forward (using "screen" as hint). I don't
>>>>> know how this should be handled vis-à-vis the Media Capture TF, but it
>>>>> seems to me that it should be included. What do others think?
>>>>>
>>>>
>>>> Good scenario, I'll see if I can incorporate it into an existing
>>>> scenario as a variation; if not, I'll spin up another scenario for
>>>> it.
>>>>
>>>
>>> Ditto.
>>>
>>>
>>>
>>>>
>>>>
>>>>> * There is no scenario with several cams used in parallel. In
>>>>> section 4.2.9 of
>>>>> http://datatracker.ietf.org/doc/draft-ietf-rtcweb-use-cases-and-requirements/?include_text=1
>>>>> two cameras are used in parallel for part of the session. Perhaps
>>>>> this is too constrained, but I think something where more than one
>>>>> is used should be in the doc.
>>>>>
>>>>
>>>> Scenario 2.4 (Video diary at the Coliseum) uses two webcams in
>>>> parallel, but only records from one of them at a time. How would you
>>>> suggest that be changed?
>>>>
>>>
>>> Would it not be possible to use both at the same time while recording,
>>> capturing a video of both himself and the Coliseum? When "playing/viewing"
>>> the diary, the layout could be such that Albert's head is overlaid as a
>>> small video in the corner of the main video (showing the Coliseum) for
>>> parts of the sequence.
>>>
>>> Or incorporate the "hockey" use case. (Randell had further suggestions).
>>>
>>>
>>>
>>>>
>>>>> Other comments:
>>>>> ===============
>>>>> * Section 5.3.1. About audio preview, I don't really understand all of
>>>>> that. What we (in our prototyping) have been using for pre-view is a
>>>>> video element which we set up as "muted". Would that not be the normal
>>>>> way to do it? And of course you should be able to add a meter of some
>>>>> kind indicating the levels you get from the mic (to be able to adjust).
>>>>>
>>>>
>>>> I'm calling out the problem of previewing your own local audio input
>>>> (not hearing audio from a PeerConnection or other source). I suspect
>>>> (but have not confirmed) that this can be a major problem. First
>>>> off, none of the six scenarios require this capability. The concern I
>>>> see is that in most simple use cases, the developer will request both
>>>> audio and video capabilities from getUserMedia. Then they'll attach
>>>> that MediaStream to a video element. Let's say the device is a
>>>> laptop. The laptop's array microphone will be enabled. When the user
>>>> speaks, the array-mic will send the signal through to the PC's
>>>> speakers which will amplify and blast that sound back to the user,
>>>> some of which will get picked up by the array-mic again and be sent
>>>> out the speakers... I'm somewhat familiar with this based on some
>>>> amateur work I've done in live sound performances.
>>>>
>>>
>>> This is indeed a problem; but I saw it like this: first of all we could
>>> have code examples in the spec that mute the audio for the self view (with
>>> some comment on why), and secondly, would not the application developer
>>> detect this problem when doing the very first basic test? And then fix it
>>> long before the app is used by anyone else.
>>>
>>>
>>>
>>>> I could be making this into a bigger problem than it is, however.
>>>> Implementor feedback testing on a variety of devices would be
>>>> helpful.
>>>>
>>>>
>>>>
>>>>> * Again, 5.3.1., I don't understand why you would limit display to
>>>>> one version of the captured video. Sure, that is the most natural
>>>>> way, but should we not let that be up to the application?
>>>>>
>>>>
>>>> I tend to agree. This is more of an opportunity for an implementation
>>>> to optimize if desired, not something necessarily for the spec to
>>>> mandate.
>>>>
>>>>
>>>>> * 5.3.1: I tend to think this is out of scope for the TF (just as
>>>>> is said for Pre-processing). There are already ways to do pre-view
>>>>> in the toolbox.
>>>>>
>>>>
>>>> Which point is this referring to specifically?
>>>>
>>>
>>> Sometimes I'm very unclear :-(. What I meant was that perhaps the TF
>>> does not have to deal with pre-viewing at all since there are already tools
>>> available (audio/video/media elements among others) that can be used.
>>>
>>>
>>>
>>>>
>>>>> * 5.5, 5.6: I think a lot of the tools under 5.6.1 are actually
>>>>> usable also for pre-processing. And note that the Audio WG now has
>>>>> _three_ documents in the making, in addition to the "Web Audio API"
>>>>> there is now the "Audio Processing API" and the "MediaStream
>>>>> Processing API"! I have no clue which will prevail, but for the
>>>>> sake of completeness perhaps all should be listed?
>>>>>
>>>>
>>>> I'll note this, and see about linking to those other specs.
>>>>
>>>>
>>>>> * 5.7, 5.8: Here we have some challenges! Re. 5.7, it may be
>>>>> difficult to select the right devices without involving the user.
>>>>> The app may ask for a specific video, and the user must select the
>>>>> camera that makes most sense.
>>>>>
>>>>> * 5.9: I think we should allow for several active capturing devices
>>>>> in parallel.
>>>>>
>>>>
>>>> Sure; it just may not be physically possible in some devices. As long
>>>> as we define appropriate error conditions then it's fine.
>>>>
>>>>
>>>>
>>>
>>>
>>
>
Received on Wednesday, 18 January 2012 15:49:43 UTC