Re: Privacy Guidance Draft - Your Feedback Needed from Hannes Tschofenig on 2013-07-12 (public-privacy@w3.org from July to September 2013)

From: Hannes Tschofenig <Hannes.Tschofenig@gmx.net>
Date: Fri, 12 Jul 2013 11:46:41 +0300
To: David Singer <singer@apple.com>
Cc: Hannes Tschofenig <Hannes.Tschofenig@gmx.net>, "public-privacy (W3C mailing list)" <public-privacy@w3.org>
Message-Id: <D0487D61-B97A-47A5-A6F9-0DDB8D59B4C5@gmx.net>

On Jul 12, 2013, at 11:27 AM, David Singer wrote:

> Actually, I was thinking of questions to be asked

I have collected the questions I had seen and put them (with a little bit of extra wording) in Section 4 of http://www.tschofenig.priv.at/w3c-privacy-guidelines.html
I was not able to re-use all questions that I had seen because I just didn't understood some of them (hence my mail to the list asking for clarifications).

Are the questions I currently have a reasonable staring point?

Ciao
Hannes

PS: Here is a copy-and-paste from the current document version (with the questions categorized into different areas):
Data Minimisation

Minimisation is a strategy that involves exposing as little information to other communication partners as is required for a given operation to complete. More specifically, it requires not providing access to more information than was apparent in the user-mediated access or allowing the user some control over which information exactly is provided.

For example, if the user has provided access to a given file, the object representing that should not make it possible to obtain information about that file's parent directory and its contents as that is clearly not what is expected.

Basic fingerprinting guidance can be found here. The following questions are relevant for specification authors: Is the designed protocol vulnerable to fingerprinting? If so, how? Can it be re-designed, configured, or deployed to reduce or eliminate this fingerprinting vulnerability? If not, why not?

In context of data minimisation it is natural to ask what data is passed around between the different parties, how persistent the data items and identifiers are, and whether there are correlation possibilities between different protocol runs.

For example, the W3C Device APIs Working Group has defined the following requirements in their Device API Privacy Requirements document.

APIs must make it easy to request as little information as required for the intended usage. For instance, an API call should require specific parameters to be set to obtain more information, and should default to little or no information.
APIs should make it possible for user agents to convey the breadth of information that the requester is asking for. For instance, if a developer only needs to access a specific field of a user address book, it should be possible to explicitly mark that field in the API call so that the user agent can inform the user that this single field of data will be shared.
APIs should make it possible for user agents to let the user select, filter, and transform information before it is shared with the requester. The user agent can then act as a broker for trusted data, and will only transmit data to the requester that the user has explicitly allowed.
Data minimisation is applicable to specification authors, implementers as well as to those deploying the final service.

As an example, consider mouse events. When a page is loaded, the application has no way of knowing whether a mouse is attached, what type of mouse it is (e.g., make and model), what kind of capabilities it exposes, how many are attached, and so on. Only when the user decides to use the mouse — presumably because it is required for interaction — does some of this information become available. And even then, only a minimum of information is exposed: you could not know whether it is a trackpad for instance, and the fact that it may have a right button is only exposed if it is used. For instance, the Gamepad API makes use of this data minisation capability. It is impossible for a Web game to know if the user agent has access to gamepads, how many there are, what their capabilities are, etc. It is simply assumed that if the user wishes to interact with the game through the gamepad then she will know when to action it — and actioning it will provide the application with all the information that it needs to operate (but no more than that).

The way in which the functionality is supported for the mouse is simply by only providing information on the mouse's behaviour when certain events take place. The approach is therefore to expose event handling (e.g., triggering on click, move, button press) as the sole interface to the device.

The following questions arise concerning data minimalization:

a. Identifiers. What identifiers does the protocol use for distinguishing initiators of communications? Does the protocol use identifiers that allow different protocol interactions to be correlated? What identifiers could be omitted or be made less identifying while still fulfilling the protocol's goals?
b. Data. What information does the protocol expose about individuals, their devices, and/or their device usage (other than the identifiers discussed in (a))? To what extent is this information linked to the identities of the individuals? How does the protocol combine personal data with the identifiers discussed in (a)?
c. Observers. Which information discussed in (a) and (b) is exposed to each other protocol entity (i.e., recipients, intermediaries, and enablers)? Are there ways for protocol implementers to choose to limit the information shared with each entity? Are there operational controls available to limit the information shared with each entity?
d. Fingerprinting. In many cases the specific ordering and/or occurrences of information elements in a protocol allow users, devices, or software using the protocol to be fingerprinted. Is this protocol vulnerable to fingerprinting? If so, how? Can it be designed to reduce or eliminate the vulnerability? If not, why not?
e. Persistence of identifiers. What assumptions are made in the protocol design about the lifetime of the identifiers discussed in (a)? Does the protocol allow implementers or users to delete or replace identifiers? How often does the specification recommend to delete or replace identifiers by default? Can the identifiers, along with other state information, be set to automatically expire?
f. Correlation. Does the protocol allow for correlation of identifiers? Are there expected ways that information exposed by the protocol will be combined or correlated with information obtained outside the protocol? How will such combination or correlation facilitate fingerprinting of a user, device, or application? Are there expected combinations or correlations with outside data that will make users of the protocol more identifiable?
g. Retention. Does the protocol or its anticipated uses require that the information discussed in (a) or (b) be retained by recipients, intermediaries, or enablers? If so, why? Is the retention expected to be persistent or temporary?

Trade-offs

Designing privacy features into a protocol or architecture often requires tradeoffs to be made. As a designer we often make these decisions without spending too much thoughts about them. For others, this decision making process is, however, important to judge the value of certain design decisions.

Does the protocol make trade-offs between privacy and usability, privacy and efficiency, privacy and implementability, privacy and security, or privacy and other design goals? Capture these trade-offs and the rationale for the design chosen.

Defaults

Protocol often come with flexible options so that it can be tailored to specific environments. Does the default mode minimize the amount, identifiability, and persistence of the data and identifiers exposed by the protocol? Does the default mode or option maximize the opportunity for user participation? Does it provide the strictest security features of all the modes/options?

If any of these answers are no, explain why less protective defaults were chosen.

User Participation

Many Web protocols allow data to be made available through APIs and other protocol mechanisms. It is important for users to be able to control access to their data. What controls or consent mechanisms does the protocol define or require before personal data or identifiers are shared or exposed via the protocol/API?

Does the user have control over their data after it has been shared with other parties, for example regarding secondary use? Are users able to determine what information was shared with other parties?

What recommendations can be given to implementers and the deployment community regarding privacy-friendly sharing of information? This in particular refers to the ability to provide additional information about why the sharing is taking place (the purpose), what information is shared and with whom. Further relevant is to control the degree (or the granularity) of information sharing. Will the data subject be given enough context to make an informed decision? Will the decision about sharing with others be persistent? If so, for how long is the decision cached and how can it be revoked?

Received on Friday, 12 July 2013 08:47:13 UTC