- From: Greg Norcie <gnorcie@cdt.org>
- Date: Thu, 20 Aug 2015 15:38:33 -0400
- To: "public-privacy (W3C mailing list)" <public-privacy@w3.org>
- Cc: Joe Hall <joe@cdt.org>
- Message-ID: <CAMJgV7Z=tCbMJC1d3FAP9rtQjxZCv_yE3Grw5Q3VNi+J4VtYOA@mail.gmail.com>
Hi all,

I reviewed the Presentation API <http://www.w3.org/TR/presentation-api/> using the Privacy Questionnaire. Results are below, followed by some discussion of what was and was not captured.

Before I begin, I think we should all pause and give some credit to the folks working on this standard. I think they're doing a great job working to minimize any privacy impacts that might be present.

I used the most recent version of the questionnaire available on the wiki when I started (hardlink <https://www.w3.org/wiki/index.php?title=Privacy_and_security_questionnaire&oldid=85382> for future reference):

1. Does this specification have a "Privacy Considerations" section?
   - No, the standard bundles security and privacy into one section.
   - (Though it should be noted they couldn't be expected to, since the privacy questionnaire is in beta :) )
2. Does this specification collect personally derived data?
   - Not directly, however any audio/video will contain inherently private data.
3. Does this specification generate personally derived data, and if so how will that data be handled?
   - Yes, this specification can collect audio/video data. Also, this spec can (in its currently
4. Does this specification allow an origin access to a user’s location, and if so is that information minimized?
   - Not directly, but audio/video could be used to derive a location.
- How should this specification work in the context of a user agent’s "incognito" mode?
   - The spec should clear all permissions after an incognito session, with no traces the mode was used on the machine.
   - While in operation, a tab that is "incognito" should be considered a separate instance from any instances in the non-incognito tabs.
- Is it possible to spoof/fake the data being generated for privacy purposes?
   - Presumably, but the onus is on the consumer to use software to set up a virtual device.
   - (IMHO this is acceptable, as long as the spec specifies it should not actively deny users the option to send video data to a virtual device... maybe this sentiment should be explicitly mentioned in the question?)
- Does the standard utilize data that is personally-derived, i.e. derived from the interaction of a single person, or their device or address? If the data could be re-correlated, does the data record contain elements that would explicitly enable such re-correlation such as unique identifiers?
   - Yes, but aside from the usual caveats about facial recognition, re-correlation does not appear to be an issue.
- Does the data record contain elements that would enable re-correlation when combined with other datasets through the property of intersection?
   - No (just audio/video)
- Is the user likely to know if information is being collected?
   - Yes, the user will have to interact with their computer in order to enable the presentation display.
- Can the user easily, preferably through an element of the GUI, revoke consent granted to a particular feature?
   - Not necessarily - as I understand it, there is not currently a GUI element to revoke consent to the Presentation API once granted.

Overall, I think the questionnaire is moving forward. With some language tweaks and additions I feel like we will be 80% there, but there are still some major issues, so based on my reading I plan to make several changes. I'm sharing them here rather than diving straight into the wiki and editing, so that people have a chance to give feedback before the changes go in.

- I'd like to remove the security section since Mike West's questions <https://w3ctag.github.io/security-questionnaire/> cover that aspect nicely, and I think forcing people to do a separate, explicit privacy review is extremely desirable.
   - (Too often people do a security review, assume that security is a subset of privacy, and then consider their spec review finished)
   - We can discuss maybe merging the two in the future, but for now I think they should stay separate.
- I plan to edit the text a bit so it's more formal... this is my own fault since I wrote a large chunk of this. I know it is a draft, but I feel I was way too conversational in several questions.
- I also plan to edit the wiki formatting so we can link to individual questions; this will make it easier to discuss the questions IMHO.
- For question 1 ("*Does this specification have a "Privacy Considerations" section?*") we should make it clearer that the "privacy considerations section" must be on its own (not a "privacy and security considerations" section where someone can list off their encryption techniques and avoid critical examination of privacy impacts).
- For question 2 ("*Does this specification collect personally derived data?*") we should clarify this refers to what in the USA would be "PII" - addresses, SSN/national ID #, ZIP/postal code, etc. Conversely, question 3 will inquire about data collected from a user via *sensors* that may be sensitive (audio, video, telemetry data, etc).
- For question 4 ("*Does this specification allow an origin access to a user’s location, and if so is that information minimized?*") mention _direct_ access to distinguish
- For question 5 ("*How should this specification work in the context of a user agent’s "incognito" mode?*") we may also want to address the issue of local security vs network security in the explanation, or split it into two separate questions.
- For question 6 ("*Is it possible to spoof/fake the data being generated for privacy purposes?*") we should make it clearer that a specification should merely respect virtual devices/streams/other sources (which may be spoofed) rather than explicitly creating this functionality in the specification itself.
- For question 7 ("*Does the standard utilize data that is personally-derived, i.e. derived from the interaction of a single person, or their device or address?*") it should be clarified that this refers to the traditional definition of PII, and is not intended to reflect personal info such as a photo of the user.
- For question 8 ("*Does the data record contain elements that would enable re-correlation when combined with other datasets through the property of intersection?*") we should rewrite it to clarify that this means fingerprinting. ("Property of intersection" is unnecessarily academic IMHO)
- None of these questions addresses the threat of pervasive surveillance (see RFC 7258 <https://tools.ietf.org/html/rfc7258>). I propose adding a question: "Does this standard protect the user against pervasive surveillance through the use of encryption (when possible)". Explanatory text can elaborate that we are referring to technologies like TLS.
- Question 10 ("*Can the user easily, preferably through an element of the GUI, revoke consent granted to a particular feature?*") and 11 ("*Once consent has been given, is there a mechanism whereby it can be automatically revoked after a reasonable, or user configurable, period?*") are redundant, so instead we should edit them so 10 deals with granting permission, and 11 deals with revoking it.
- Just noticed #2 needs an explanation, and could probably use a quick pass for grammar (my fault since I wrote it :) )

Finally, there is one question that I'm not sure how the current questionnaire can address: how do we handle the fact that data is often only transported by a standard, and how that data is used afterwards is hard to embed into a spec?
-- 
/***********************************/
*Greg Norcie (norcie@cdt.org)*
*Staff Technologist*
*Center for Democracy & Technology*
1634 Eye St NW Suite 1100
Washington DC 20006
(p) 202-637-9800
PGP: http://norcie.com/pgp.txt
Fingerprint: 73DF-6710-520F-83FE-03B5 8407-2D0E-ABC3-E1AE-21F1
/***********************************/
Received on Thursday, 20 August 2015 19:39:21 UTC