RE: Evolving Online Privacy - Advancing User Choice

Thank you for the feedback, Joe. I'll do my best to address each of the points below in the next revision (definitions, term and verb-tense corrections, and more detail in areas that the WG feels need more explanation).  I disagree with the unlinkability assessment and believe there is a compromise position where data can no longer reasonably be reverse-engineered to an identity yet is still useful for analysis that leads to innovation and a better Internet for everyone.

- Shane

-----Original Message-----
From: Joseph Lorenzo Hall [mailto:joehall@gmail.com] 
Sent: Wednesday, June 20, 2012 11:45 AM
To: Shane Wiley
Cc: public-tracking@w3.org
Subject: Re: Evolving Online Privacy - Advancing User Choice

Some comments on the Evolving Online Privacy proposal:

* Definitions: I think the definitions section still needs some work
to be clearer. For example, in (C) a first party is defined as "the
party that owns the Web site or has control over the Web site the
consumer visits"; "owns" and, more so, "controls" aren't defined
here. I doubt a user posting something to a forum, for example, is
what you intended by "controls", but to some extent that user is
changing the content of the web site that other users will see.

   * Also, the NOTE in that definition seems to enumerate a few
actions that don't result in a first-party relationship... but is it
only those three? Is there a more neutral way to word that or maybe
list the actions that do explicitly result in a "first party widget
interaction"?

   * Later in the document, you use the word profiling as in "No
profiling" (which is bolded, which raises the question of what bolded
text means in this document). What "profiling" means doesn't seem to
be defined.

* p. 2 uses "browser agent" when I think you mean "user agent".

* typo, p.2: "DNT significantly impacted the availability" -> "DNT
significantly impacting the availability"

* I'm a little skeptical that aggregate reporting requires retaining
raw data, though I'm not familiar with how that works. For sums, at
least, you can keep a running sum; that doesn't work so well with
other kinds of statistics (like the median).
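
To illustrate the running-sum point, here is a minimal Python sketch. The class name RunningTotals and the sample values are made up for illustration and do not come from the proposal; the sketch only shows that a sum, count, and mean can be updated one event at a time with constant state, whereas an exact median as computed here needs every value to be kept.

```python
import statistics


class RunningTotals:
    """Hypothetical aggregate that never stores individual events."""

    def __init__(self):
        self.count = 0
        self.total = 0

    def add(self, value):
        # Only two numbers are updated; the raw value is then discarded.
        self.count += 1
        self.total += value

    @property
    def mean(self):
        return self.total / self.count if self.count else 0.0


# Made-up sample data standing in for per-event measurements.
events = [3, 1, 4, 1, 5, 9, 2, 6]

totals = RunningTotals()
for value in events:
    totals.add(value)

# Sum, count, and mean are available without the raw events.
print(totals.total, totals.count, totals.mean)

# An exact median computed this way needs the full list of values.
print(statistics.median(events))
```

Approximate streaming estimates of the median exist, but they trade accuracy for storage, which is a separate design decision from the simple running sum shown here.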

* I don't think I understand III (4)(b)... servers should be able to
defend a decision? That could use a bit more precision.

* Finally, the last bullet in unlinkability says you can hash a unique
ID to anonymize and that servers should rotate keys. First, this has
probably been mentioned before, but hashing a unique ID is
pseudonymization, not anonymization.  Second, cryptographic hashing
doesn't involve a key... it typically takes a "salt", which can make
dictionary attacks harder.  However, unless the input is particularly
rich, the space of possible values this would "hash" to is small, so
it's not so much about rotating keys (which sounds more like HMAC) but
password-like protections such as "key stretching" (e.g., running a
hash algorithm many, many times so that you effectively increase the
time and effort an attacker would need to expend).  This is all a
long-winded way of saying that if you want to begin to anonymize for
unlinkability, you'll have to simply remove unique IDs, not turn them
into another kind of unique ID.
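
The distinctions drawn in the last bullet can be sketched in Python using only the standard library (hashlib and hmac). The function names, the "user-NNNN" ID format, and the iteration count are assumptions chosen for illustration, not anything specified in the proposal. The sketch shows that a plain hash maps the same ID to the same pseudonym every time, that a small ID space can be searched exhaustively, that a keyed HMAC changes its output when the key is rotated, and that PBKDF2 is one way to do key stretching.

```python
import hashlib
import hmac


def plain_hash(user_id: str) -> str:
    # Unkeyed hash: the same ID always yields the same pseudonym.
    return hashlib.sha256(user_id.encode()).hexdigest()


def keyed_hash(user_id: str, key: bytes) -> str:
    # HMAC does take a key; rotating the key changes every pseudonym.
    return hmac.new(key, user_id.encode(), hashlib.sha256).hexdigest()


def stretched_hash(user_id: str, salt: bytes, iterations: int = 100_000) -> str:
    # Key stretching via PBKDF2: each guess costs `iterations` hash rounds.
    return hashlib.pbkdf2_hmac("sha256", user_id.encode(), salt, iterations).hex()


def dictionary_attack(target: str, candidates):
    # With a small ID space, every candidate can simply be hashed and compared.
    for candidate in candidates:
        if plain_hash(candidate) == target:
            return candidate
    return None


# Hypothetical ID space of 10,000 values, assumed for illustration only.
id_space = [f"user-{n:04d}" for n in range(10_000)]

pseudonym = plain_hash("user-0042")
recovered = dictionary_attack(pseudonym, id_space)
print(recovered)
```

In every variant the output is still a stable per-user identifier for as long as the key or salt stays the same, which is why the bullet above concludes that the unique ID has to be removed rather than transformed.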

best, Joe

-- 
Joseph Lorenzo Hall
Postdoctoral Research Fellow
Media, Culture and Communication
New York University
https://josephhall.org/

Received on Wednesday, 20 June 2012 19:03:18 UTC