RE: Settings retrieval/application API Proposal (formerly: constraint modification API v3)

From: Jim Barnett <Jim.Barnett@genesyslab.com>
Date: Thu, 13 Sep 2012 08:20:30 -0700
Message-ID: <E17CAD772E76C742B645BD4DC602CD8106B5F3C3@NAHALD.us.int.genesyslab.com>
To: <Frederick.Hirsch@nokia.com>, <jsoref@rim.com>
Cc: <public-media-capture@w3.org>
We clearly cannot try to standardize the behavior of face detection
algorithms, or any other feature of the underlying devices.  However, in
the case of a setting that has max and min values, the test could set
the value to the max and then to the min, and the tester would verify by
inspection that the corresponding feature of the underlying device
(camera, etc.) had been set to its largest and smallest values.  (If for
a certain device max==min, there would be no detectable change, but the
_UA_ would still pass the test.)  
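
As a rough sketch of the max/min test described above (the RangedSetting
shape and the function names are assumptions made for illustration, not
anything from a draft), the test could record what it asked for, what
the UA reported back, and what the tester is expected to confirm by
inspection:

```typescript
// Hypothetical shape for a setting with a range; not taken from any draft.
interface RangedSetting {
  readonly min: number;
  readonly max: number;
  value: number;
}

// One step of the manual test: what was requested, what the UA reported
// afterwards, and what the human tester should confirm on the device.
interface ManualCheck {
  requested: number;
  reported: number;
  instruction: string;
}

// Drive the setting to max and then to min, recording each step.
function minMaxChecks(name: string, setting: RangedSetting): ManualCheck[] {
  const targets: Array<[string, number]> = [
    ["largest", setting.max],
    ["smallest", setting.min],
  ];
  return targets.map(([label, target]) => {
    setting.value = target;
    return {
      requested: target,
      reported: setting.value,
      instruction: `Confirm that ${name} on the device is at its ${label} value.`,
    };
  });
}

// The UA passes if every reported value matches the requested one, even
// when max == min and the tester sees no change on the device.
function uaPasses(checks: ManualCheck[]): boolean {
  return checks.every((c) => c.reported === c.requested);
}
```

The pass/fail decision for the UA depends only on the reported values;
the instruction strings are for the human doing the inspection.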

Another consideration is how the API should handle the case of a missing
capability.  For the sake of an example, suppose that we do define a
capability for face recognition.  How should the API handle a device
that doesn't support it?  How does it report that the feature is
missing?  Again, it should be possible for a UA to support the face
recognition capability even if it is running with a camera that doesn't
have that capability: the UA must report that the capability is not
available and handle attempts to set it gracefully.
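
One possible shape for this, purely as an illustration (CameraSettings,
query, set and the result strings are all made-up names, and returning a
status instead of throwing is just one option):

```typescript
// Hypothetical result of asking about a capability: either it is absent,
// or it is present with a current value.
type Capability = { available: false } | { available: true; value: boolean };

// Hypothetical outcome of a set attempt; nothing is thrown for a missing
// capability, the caller just gets told it is unavailable.
type SetResult = "applied" | "unavailable";

class CameraSettings {
  private readonly supported: Map<string, boolean>;

  constructor(supported: Map<string, boolean>) {
    this.supported = supported;
  }

  // Report whether the device behind the UA has the capability at all.
  query(name: string): Capability {
    const value = this.supported.get(name);
    return value === undefined ? { available: false } : { available: true, value };
  }

  // Setting a missing capability is a no-op that reports "unavailable".
  set(name: string, value: boolean): SetResult {
    if (!this.supported.has(name)) return "unavailable";
    this.supported.set(name, value);
    return "applied";
  }
}
```

With a camera that only supports "flash", query("faceRecognition") would
report that the capability is not available and set("faceRecognition",
true) would return "unavailable" without changing anything.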

Finally, even if we decide that a certain set of capabilities is common
enough to be included in the spec, we also have to consider how the
UA/API should handle device-specific capabilities that aren't in the
spec.  

- Jim

-----Original Message-----
From: Frederick.Hirsch@nokia.com [mailto:Frederick.Hirsch@nokia.com] 
Sent: Thursday, September 13, 2012 9:45 AM
To: jsoref@rim.com
Cc: Frederick.Hirsch@nokia.com; public-media-capture@w3.org
Subject: Re: Settings retrieval/application API Proposal (formerly:
constraint modification API v3)

One of the requirements for a specification to move to Recommendation in
the W3C is for interop testing to be completed on all the features.

Maybe this has already been addressed, but what does it mean to test
capabilities for interop?

Does it mean to simply be able to set and obtain back setting values? In
that case whether or not face detection works at all is irrelevant, as
the test is about whether the capability value can be set and obtained
(not whether there is truly an implementation behind it or whether such
implementations are consistent).
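
To make that reading concrete, here is a minimal sketch of a set/get
round-trip check. SettingStore and roundTrips are hypothetical names,
not part of any proposal; the point is only that the check never looks
at what the device actually does:

```typescript
// Hypothetical minimal view of a settings API: set a value, read it back.
interface SettingStore<T> {
  get(name: string): T | undefined;
  set(name: string, value: T): void;
}

// True if every value written to the named setting can be read back
// unchanged. Nothing here checks whether the feature does anything.
function roundTrips<T>(store: SettingStore<T>, name: string, values: T[]): boolean {
  return values.every((v) => {
    store.set(name, v);
    return store.get(name) === v;
  });
}
```

A plain Map<string, boolean> passes this check for "faceDetection" with
no face detection behind it at all, which is exactly the question: is
that enough to count as interop?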

There are a number of arguments against considering the algorithms used
to implement such capabilities, including the need for algorithm agility
and IPR concerns, to mention two.

A way forward might be to consider how testing is to be done, and the
effort and approach needed.

regards, Frederick

Frederick Hirsch
Nokia



On Sep 11, 2012, at 5:52 PM, ext Josh Soref wrote:

> Giridhar wrote:
>> Regarding "do whatever the camera manufacturer thinks is appropriate
>> for a function with this name", we do have precedent for providing
>> implementation flexibility in the W3C.
> 
>> For instance, the Geolocation API poses no detailed requirements on
>> the underlying platform as to how to interpret the setting
>> enableHighAccuracy.
> 
> The Geolocation API is generally understood to be a pretty terrible
> API in a number of ways. Referencing it isn't a good start to any
> argument.
> 
> Another example of a crappy API is the DeviceOrientation API which was
> published by the same WG. It basically has ZERO interop.
> 
>> Maybe a more constructive way forward (as opposed to dismissing what
>> I've proposed summarily) would be to at least determine those
>> settings that would not derail the standardization effort
>> significantly.
> 
> So, "face detection" is an interesting thing. However, while it /may/
> be possible to get interop in the form of "returns a rectangle that may
> have a fuzzy face, or an animal, or a statue, or a sculpture, or
> something that isn't remotely like a face", I'm not quite sure we're
> likely to see better interop than that. And I'm really unsure we'd be
> able to get interop on "how much padding will be included in the
> detected face boundaries".
> 
> An interesting question: can a face region have multiple faces?
> Would it be legal to return the entire picture's dimensions (for the
> case of a family/team picture, as opposed to an actual badge-photo)?
> 
> Calling those QoI distinctions is pretty problematic. If half of the
> implementations do it one way, and the other half do it the other way,
> you really don't have interop, and Cordova/jQuery and similar groups
> will be forced to just write their own shims which do the detection
> they want manually (and more accurately). By that point, we've just
> mandated implementing something that doesn't work (and does add
> security/stability risks) and won't be used.
> 
>> boolean geotagging; // Default is false; true setting may be ignored
>> if UA doesn't support.  Note that if UA does not support JPEG then
>> this feature is disabled.
> 
> If the UA supports TIFF instead of JPEG, then I'd expect the tags to
> be available.
> I'd also expect the tags to just be available from the interface, in
> case I'm using PNG and just want to read the tags straight from the
> system instead of out of the image.
> 
> I'm also pretty sure that the part of speech for your boolean is
> wrong. "includeGeotags" or similar is probably better.
> 
> There's a similar risk for geotagging. I'd actually prefer that the
> geotag be specified as "include a location coord with precision
> indicator". This is distinct from features I've seen elsewhere where
> such coords are also converted to supposedly human readable strings
> (but which may be in all sorts of random languages, with misspellings
> and other amusing errors -- this is based on my work @nokia on the
> n900).
> 
>> boolean highDynamicRange;// Default is false; true setting may be 
>> ignored if UA doesn't support
> 
> Similarly, I'd expect "captureHDR" or "useHDR" or something.
> 
> Personally, on the subject of how things should be shaped, I suspect
> I'd rather have a single object attribute for a set of related things:
> 
> interface FloatMinMaxCurrentValue {
>   attribute float value;
>   attribute float min;
>   attribute float max;
> };
>
> interface SharpnessConstraints {
>   attribute FloatMinMaxCurrentValue sharpness;
> };
>
> interface RotationConstraints {
>   attribute FloatMinMaxCurrentValue rotation;
> };
>
> interface BrightnessConstraints {
>   attribute FloatMinMaxCurrentValue brightness;
> };
>
> PictureInfo implements SharpnessConstraints;
> PictureInfo implements RotationConstraints;
> PictureInfo implements BrightnessConstraints;
> 
> One advantage of this is that people are much less likely to misspell
> things.
> I'm speaking as someone who just spent a week cursing a dozen groups
> for not being able to spell words (jQuery: Suppress has two p's, WAI:
> labeled does not have doubled L's, qunit: grr, jasmine: grr, ...).
> 
Received on Thursday, 13 September 2012 15:20:06 GMT