Re: Results from Privacy review of Presentations API using Privacy Questionnaire. (Wall of text warning!) from Joseph Lorenzo Hall on 2015-08-27 (public-privacy@w3.org from July to September 2015)

From: Joseph Lorenzo Hall <joe@cdt.org>
Date: Thu, 27 Aug 2015 10:59:52 -0400
To: Greg Norcie <norcie@cdt.org>
Cc: "public-privacy (W3C mailing list)" <public-privacy@w3.org>
Message-ID: <CABtrr-V4Z+GEidNf2CV6iwZLMz7A45jwnW1WphULqczmrbkDNw@mail.gmail.com>
This is great, Greg... some comments inline. I hope others have had a
chance to take a look at the questionnaire; examining a spec with the
questions in hand seems to be very useful.

On Thu, Aug 20, 2015 at 3:38 PM, Greg Norcie <gnorcie@cdt.org> wrote:

> Hi all,
>
> I reviewed the Presentation API <http://www.w3.org/TR/presentation-api/>
> using the Privacy Questionnaire; results are below, followed by some
> discussion of what was/was not captured.
>
> Before I begin I think we should all pause and give some credit to the
> folks working on this standard. I think they're doing a great job working
> to minimize any privacy impacts that might be present.
>
> I used the most recent version of the questionnaire available on the wiki
> when I started (hardlink
> <https://www.w3.org/wiki/index.php?title=Privacy_and_security_questionnaire&oldid=85382> for
> future reference):
>
>    1. Does this specification have a "Privacy Considerations" section?
>
Does it? Sounds like, from below, it has a "security and privacy" section
but not a stand-alone privacy section.

>
>    1. Does this specification collect personally derived data?
>       - Not directly; however, any audio/video will inherently contain
>       privacy-sensitive data
>       2. Does this specification generate personally derived data, and if
>    so how will that data be handled?
>       - Yes, this specification can collect audio/video data. Also, this
>       spec can (in its currently
>
Hmm, seems like some text was cut off here.

>
>    -
>          - No, the standard bundles security and privacy into one section.
>          - (Though it should be noted they couldn't be expected to since
>          the privacy questionnaire is in beta :) )
>          - Not directly, but audio/video could be used to derive a
>          location.
>          - How should this specification work in the context of a user
>       agent’s "incognito" mode?
>          - The spec should clear all permissions after an incognito session, with
>          no traces the mode was used on the machine.
>          - While in operation, a tab that is "incognito" should be
>          considered a separate instance from any instances in the non-incognito tabs.
>          - Is it possible to spoof/fake the data being generated for
>       privacy purposes?
>          - Presumably, but the onus is on the consumer to use software to set up a
>          virtual device.
>          - (IMHO this is acceptable, as long as the spec specifies it
>          should not actively deny users the option to send video data to a virtual
>          device... maybe this sentiment should be explicitly mentioned in the
>          question?)
>          - Does the standard utilize data that is personally-derived,
>       i.e. derived from the interaction of a single person, or their device or
>       address? If the data could be re-correlated, does the data record contain
>       elements that would explicitly enable such re-correlation such as unique
>       identifiers?
>          - Yes, but aside from the usual caveats about facial recognition,
>          re-correlation does not appear to be an issue.
>          - Does the data record contain elements that would enable
>       re-correlation when combined with other datasets through the property of
>       intersection?
>          - No (just audio/video)
>
This seems like a hard question... on the one hand, if a "face" is enough
from which to derive a facial pattern that you can correlate with other
databases of facial patterns, then the answer would seem to be yes
(although I don't know of any public databases of facial biometrics). Maybe
there's a better way to get at what this question wants to get at? Does
anyone remember what the impetus for this question is? Or can we think of
examples in a spec that we'd definitely want to catch with this question?

>
>    -
>          - Is the user likely to know if information is being collected?
>          - Yes, the user will have to interact with their computer in
>          order to enable the presentation display.
>          - Can the user easily, preferably through an element of the GUI,
>       revoke consent granted to a particular feature?
>          - Not necessarily - as I understand it there is not currently a
>          GUI element to revoke consent to the presentation API once granted
>          1. Does this specification allow an origin access to a user’s
>    location, and if so is that information minimized?
>
Sounds like this last one is "no"?


> Overall, I think the questionnaire is moving forward - with some language
> tweaks and additions I feel like we will be 80% there.
>
> But there are still some major issues... so based on my reading I plan to
> make several changes... I'm sharing them here rather than just diving into
> the wiki and editing without any chance for people to give feedback before
> they go into the wiki.
>
>
>    - I'd like to remove the security section since Mike West's questions
>    <https://w3ctag.github.io/security-questionnaire/> cover that aspect
>    nicely, and I think forcing people to do a separate, explicit privacy
>    review is extremely desirable.
>    - (Too often people do a security review, assume that security is a
>       subset of privacy, and then consider their spec review finished)
>       - We can discuss maybe merging the two in the future, but for now I
>       think they should stay separate.
>       - I plan to edit the text a bit so it's more formal... this is my
>    own fault since I wrote a large chunk of this. I know it is a draft but I
>    feel I was way too conversational when reading several questions.
>    - I also plan to edit the wiki formatting so we can link to individual
>    questions, this will make it easier to discuss the questions IMHO
>    - For question 1 ("*Does this specification have a "Privacy
>    Considerations" section?*") we should make it clearer that the
>    "privacy considerations section" must be on its own (not a "privacy and
>    security considerations" section where someone can list off their
>    encryption techniques and avoid critical examination of privacy impacts)
>    - For question 2 ("*Does this specification collect personally derived
>    data?*") we should clarify this refers to what in the USA would be
>    "PII" - addresses, SSN/national ID #, ZIP/postal code, etc. Conversely,
>    question 3 will inquire about data collected from a user via *sensors*
>    that may be sensitive (audio, video, telemetry data, etc)
>
Wondering what non-US folks think of this... we can probably make it
pretty universal by talking about personal data a la the EU.

>
>    -  For question 4 ("*Does this specification allow an origin access to
>    a user’s location, and if so is that information minimized?*") mention
>    _direct_ access to distinguish
>
What did you want to get at here, Greg?

>
>    - For question 5 ("*How should this specification work in the context
>    of a user agent’s "incognito" mode?*") we may also want to address the
>    issue of local security vs network security in the explanation, or split
>    into two separate questions
>    - For question 6 ("*Is it possible to spoof/fake the data being
>    generated for privacy purposes?*") we should make it clearer a
>    specification should merely respect virtual devices/streams/other sources
>    (which may be spoofed) rather than explicitly creating this functionality in
>    their specification
>    - For question 7 ("*Does the standard utilize data that is
>    personally-derived, i.e. derived from the interaction of a single person,
>    or their device or address?*") it should be clarified this is
>    referring to the traditional definition of PII, and not intended to reflect
>    personal info such as a photo of the user.
>    - For question 8 ("*Does the data record contain elements that would
>    enable re-correlation when combined with other datasets through the
>    property of intersection?*") we should rewrite it to clarify this is
>    meant to mean fingerprinting. (Property of intersection is unnecessarily
>    academic IMHO)
>    - None of these questions addresses the threat of pervasive
>    surveillance (see RFC 7258
>    <https://tools.ietf.org/html/rfc7258>). I propose adding a question
>    "Does this standard protect the user against pervasive surveillance through
>    the use of encryption (when possible)". Explanatory text can elaborate that
>    we are referring to technologies like TLS
>
I like this; wondering what others think.


>
>    - Question 10 ("*Can the user easily, preferably through an element
>    of the GUI, revoke consent granted to a particular feature?*") and 11
>    ("*Once consent has been given, is there a mechanism whereby it can
>    be automatically revoked after a reasonable, or user configurable, period?*")
>    are redundant, so instead we should edit them so 10 deals with granting
>    permission, and 11 deals with revoking it.
>    - Just noticed #2 needs an explanation, and could probably use a quick
>    pass for grammar (my fault since I wrote it :) )
>
> Finally, there is one question that I'm not sure how the current
> questionnaire can address: How do we handle the fact that often data is
> only transported by a standard - how that data is used afterwards is hard
> to embed into a spec?
>
>
I think this is out of scope... unless we can think of a way to get this in
(would it be to recommend spec authors put language in their specs that
talks about the risks of storing data when marshaled out of the UA?).

best,
Joe


> --
> /***********************************/
>
> *Greg Norcie (norcie@cdt.org <norcie@cdt.org>)*
>
> *Staff Technologist*
> *Center for Democracy & Technology*
> 1634 Eye St NW Suite 1100
> Washington DC 20006
> (p) 202-637-9800
> PGP: http://norcie.com/pgp.txt
>
> Fingerprint:
> 73DF-6710-520F-83FE-03B5
> 8407-2D0E-ABC3-E1AE-21F1
>
> /***********************************/
>



-- 
Joseph Lorenzo Hall
Chief Technologist
Center for Democracy & Technology
1634 I ST NW STE 1100
Washington DC 20006-4011
(p) 202-407-8825
(f) 202-637-0968
joe@cdt.org
PGP: https://josephhall.org/gpg-key
fingerprint: 3CA2 8D7B 9F6D DBD3 4B10  1607 5F86 6987 40A9 A871
Received on Thursday, 27 August 2015 15:00:44 UTC