
Re: [html-media-capture] HTML Media Capture Last Call Comments

From: Anssi Kostiainen <anssi.kostiainen@nokia.com>
Date: Thu, 23 Aug 2012 14:57:20 +0300
Message-Id: <62DA163F-B738-4755-B044-AC34D452B789@nokia.com>
To: "public-device-apis@w3.org public-device-apis@w3.org" <public-device-apis@w3.org>
Hi All,

On 15.8.2012, at 22.55, ext Frederick.Hirsch@nokia.com wrote:

> I have entered the Last Call comments on HTML Media Capture [1] into the Last Call comment tracker tool, completing ACTION-562
> For the list see https://www.w3.org/2006/02/lc-comments-tracker/43696/WD-html-media-capture-20120712/

Thanks for populating the LC tracker! I assume we can address these comments on this thread and, once everyone's happy, transfer the resolutions to the tracker.

> Entering the comments (separately for each comment [2]) , we now have 9 comments on HTML Media Capture. Last Call completed 9 August.
> Roughly the comments are as follows, to give some shorthand (see the last call tracker log for details and nuances)
> 1  Clarify capture vs accept and implications LC-2642 ( https://www.w3.org/2006/02/lc-comments-tracker/43696/WD-html-media-capture-20120712/2642 )


I was wondering, how is 'capture' different from 'accept'? It seems to me the
following are equivalent:

 capture      accept
 camera       image/*
 camcorder    video/*
 microphone   audio/*
 filesystem   */*



The following reply by Rich seems to address fantasai's concerns:


I.e. no changes to the spec needed.

> 2 how to integrate images from alternative devices like scanners, print readers etc. Clarify if such requirements met. LC-2644  ( https://www.w3.org/2006/02/lc-comments-tracker/43696/WD-html-media-capture-20120712/2644 )

Robin articulated the scope of the spec nicely in his reply:


I believe we do not need to mention in the spec that revising the spec with additional inputs is possible, given that's the standard operating procedure of any lively web specification.

> 3 naming; prefer videocamera  instead of camcorder, naming more easily understood globally LC-2639  ( https://www.w3.org/2006/02/lc-comments-tracker/43696/WD-html-media-capture-20120712/2639 )


Why 'camcorder' instead of 'videocamera'? I think 'videocamera' would be more intuitive for non-native speakers.



Currently Android 4.0's stock browser and Chrome for Android implement the specification including the 'camcorder' keyword, see:


Given this, changing the 'camcorder' keyword at this stage is likely not a good idea.

> 4 add more examples, e.g. example of microphone access, and an  example of camcorder+audio input LC-2638  ( https://www.w3.org/2006/02/lc-comments-tracker/43696/WD-html-media-capture-20120712/2638 )


I would also like to see an example of microphone access, and an example of camcorder+audio input.



I added two additional examples of video and audio capture.

Re camcorder+audio: the accept attribute takes precedence over the capture attribute as per the spec. This means that if the implementation's video camera control is unable to capture audio only, camcorder+audio would behave the same as if no capture attribute were present. If the implementation's video camera control is able to capture audio only (in addition to video), then camcorder+audio and microphone+audio would yield similar results, i.e. an audio file.

If someone has a concrete proposal how to improve the prose, please let us know. Currently, the normative prose is as follows:


The HTMLInputElement interface's accept attribute takes precedence over the capture attribute. That is, if the accept attribute's value is set to a MIME type that is not accepted in a defined capture state, the user agent must act as if there was no capture attribute.
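
To make the precedence rule a bit more concrete, here is a rough sketch of how a UA might evaluate it. This is only an illustration, not spec text: the effectiveCapture helper and the acceptedTopLevelType table are hypothetical names, the mapping of capture keywords to media types is assumed from the capture/accept table quoted under LC-2642, and accept is treated as a single MIME type (lists and file extensions are not handled).

```typescript
// Capture keywords discussed in this thread.
type CaptureKeyword = "camera" | "camcorder" | "microphone" | "filesystem";

// Assumption: each capture state accepts one top-level media type,
// following the capture/accept table quoted under LC-2642.
const acceptedTopLevelType: Record<CaptureKeyword, string> = {
  camera: "image",
  camcorder: "video",
  microphone: "audio",
  filesystem: "*",
};

// Hypothetical helper: returns the capture keyword the UA would honour,
// or null when it must act as if there was no capture attribute.
function effectiveCapture(
  accept: string,
  capture: CaptureKeyword
): CaptureKeyword | null {
  const wanted = acceptedTopLevelType[capture];
  const mime = accept.trim().toLowerCase();
  // No accept restriction, or a state that takes any file: keep capture.
  if (mime === "" || wanted === "*") return capture;
  const topLevel = mime.split("/")[0];
  // accept wins: an incompatible MIME type cancels the capture hint.
  return (topLevel === wanted || topLevel === "*") ? capture : null;
}
```

Under these assumptions, effectiveCapture("audio/*", "camcorder") returns null, which corresponds to the camcorder+audio case above for a camera control that cannot record audio only.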


Doug also mentions a potential use case for video-only capture:


The 'camcorder' keyword value may conflate video and audio; I can 
credibly see a use-case for video-only capture, and user expectation may 
be ambiguous if that's not called out explicitly when they are selecting 
their input (e.g. they may be unpleasantly surprised when they accept a 
video source and their audio is also captured).

The user may also wish to select a different microphone input than is 
bundled with the videocamera.

I suggest that the value of @capture should be a list of strings, not 
just a single value, i.e.
  <input type="file" accept="image/*" capture="camcorder,microphone">

This may result in a pair of source selections, sequentially selecting 
first the videocamera input, then the microphone input (or, depending on 
the UA and device, might have both in the same dialog... either way, it 
should be explicit).


I'd leave this to the UA to handle within the video camera control (the camera UI could have e.g. a "make silent / mute" control and a microphone selection control).

> 5 Specify security related to when mic is turned on etc LC-2636  ( https://www.w3.org/2006/02/lc-comments-tracker/43696/WD-html-media-capture-20120712/2636 )


* What happens if the requested device is not present on the system? People with disabilities may have a different sub-set of devices available than mainstream users. We suggest the specification state explicitly that the user agent should fall back to a standard file upload widget in this situation.
* The specification should make explicit statements about security expectations, e.g., requesting permission before turning the microphone on in order to capture from it. 



The first one is addressed by the following normative prose:


The capture attribute's invalid value default and missing value default is the File Upload state.
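
As a minimal sketch of that defaulting behaviour, the attribute value could be mapped to a state as below. The captureState function is a hypothetical name, and the state labels other than File Upload are illustrative labels chosen for this example; only the fallback to the File Upload state for missing and invalid values is taken from the prose above.

```typescript
// State labels are illustrative, apart from "File Upload".
type CaptureState =
  | "Image Capture"
  | "Video Capture"
  | "Audio Capture"
  | "File Upload";

// Hypothetical helper: maps a capture attribute value to a state.
// A null value stands for an absent attribute.
function captureState(value: string | null): CaptureState {
  if (value === null) return "File Upload"; // missing value default
  // Enumerated attribute keywords are matched case-insensitively.
  switch (value.toLowerCase()) {
    case "camera":
      return "Image Capture";
    case "camcorder":
      return "Video Capture";
    case "microphone":
      return "Audio Capture";
    case "filesystem":
      return "File Upload";
    default:
      return "File Upload"; // invalid value default
  }
}
```

With this sketch, an unknown keyword such as "scanner" and an absent attribute both end up in the File Upload state, i.e. the standard file picker.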


I added the following proposal by Frederick (thanks!) to the Security and privacy considerations section to address the latter comment:


The UA should not enable any device for media capture, such as a microphone or camera, until a user interaction giving implicit consent is completed. A user agent should also provide an indication when such an input device is enabled and make it possible to terminate such capture.


> 6 enable hint whether front-facing (user-facing) or other camera LC-2641  ( https://www.w3.org/2006/02/lc-comments-tracker/43696/WD-html-media-capture-20120712/2641 )


2. Many devices have more than one camera. Although it makes no sense to
insist on a given one (some systems, like super-fancy video conferencing
systems that used to be science fiction and are now feasible to construct
from spare parts, have a lot of cameras), The fact that in a huge number
of cases you can divide them into "facing the user" and "the rest",
suggests that it would be nice to at least allow the simple hint as an
author request.


Another similar comment:


That does not allow hinting the user-facing camera ought to be used
(typical phone these days has two cameras). (Not sure that use case is
addressed now though.)



One day there might be devices with an arbitrary number of cameras, so we should perhaps leave this up to the UA instead of hard-coding it. I could think of a camera UI which allows the user to seamlessly switch between cameras on the spot. However, if someone has a nice concrete proposal in mind, please let us know.

> 7 Specify default when requested device is not available LC-2635  ( https://www.w3.org/2006/02/lc-comments-tracker/43696/WD-html-media-capture-20120712/2635 )

Duplicate of LC-2636.

> 8 clarify use case of capturing video without audio or with alternate audio device than that bundled with video capture device; list of capture strings? LC-2637  ( https://www.w3.org/2006/02/lc-comments-tracker/43696/WD-html-media-capture-20120712/2637 )

Addressed by LC-2638.

> 9 Specify that capture attribute can be in markup, not just script LC-2640  ( https://www.w3.org/2006/02/lc-comments-tracker/43696/WD-html-media-capture-20120712/2640 )

The specification extends the input element's File Upload state, which is defined in HTML5. In order to be consistent with the HTML5 spec, this spec also specifies the extensions in terms of the DOM. Here's the relevant HTML5 reference:


Since DOM trees are used as the way to represent HTML documents when they are processed and presented by implementations (especially interactive implementations like Web browsers), this specification is mostly phrased in terms of DOM trees, instead of the markup described above.



> Thanks to all who gave comments. Now the WG needs to decide to resolve each issue (please refer to last call comment tracker for detailed comments).  Please discuss on list, including LC-# in the subject.

Thank you everyone for reviewing the specification and your comments! Thanks Frederick for populating the LC tracker. Let me know if I missed someone's comment.

The diff to the previous version of the spec [1] is at:


The Editor's Draft with the changes outlined in this mail is at:


Tracker, this completes ACTION-567.


> [1]  http://www.w3.org/TR/2012/WD-html-media-capture-20120712/
> [2] Doug gave a good example of how to make comments, making it easy to enter into tracker by separating them and indicating the applicable section, and category, see http://lists.w3.org/Archives/Public/public-device-apis/2012Jul/0031.html
Received on Thursday, 23 August 2012 11:57:50 UTC
