Re: Towards a Grand Compromise from Roy T. Fielding on 2012-06-16 (public-tracking@w3.org from June 2012)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Sat, 16 Jun 2012 02:53:06 -0700
To: Jonathan Mayer <jmayer@stanford.edu>
Cc: public-tracking@w3.org
Message-Id: <9F145BFB-29BA-4D6A-8787-0DCD91336740@gbiv.com>

Regarding the "Do Not Track - Compromise Proposal" of 11 June 2012:

I am disappointed that this proposal repeats so much of the current
compliance document that has been repeatedly rejected by those
who are expected to implement it.  The key to any consensus process
is to make positive steps toward adoption by all members, not by
splitting the difference on never-going-to-be-implemented features.

Fundamental problems
====================

0) This is not the collection protection working group -- it is the
   tracking protection working group and it has a chartered delivery
   requirement for a document that defines the meaning of a do not
   track preference.

   DNT expresses "do not track".  We either agree on a definition for
   tracking or we don't have a protocol.  When we have a definition for
   tracking, all of the constraints on data collection will be a subset
   of that definition.  Adobe will not accept blanket constraints on all
   data collection based on the theory that a few exceptions will cover
   our needs: we have no way to anticipate all of the future needs of
   our customers that might have nothing to do with tracking.

   The user is asking not to be tracked, to which we can comply if we
   have a specific definition that includes both the process of tracking
   and data collection enabled by that tracking.  Some regulators are
   requiring prior consent for collection of personal information,
   to which we can comply by obtaining that prior consent or by
   excluding the personal information from retention. We have no
   interest in requirements for DNT that exceed these two mandates
   from users and regulators.  If additional privacy issues arise,
   they should be addressed regardless of whether the user has
   transmitted DNT; thus, they are outside our current scope.

   Data collection that is not associated with user activity and
   which does not represent an identified privacy issue should not
   be constrained by this standard.  If privacy violations occur,
   the responsible parties should be punished regardless of DNT.
   There is no need for DNT to cover all privacy concerns.


1) We have not found a way for a UA to indicate whether a given
   request is intentional or not, and we cannot design based on
   meaningless phrases like "infer with a high probability".
   The distinction between first and third party resource must either
   be testable via the protocol or determinable independent of
   specific requests.  
   
   A server cannot prevent any given first party resource from being
   reused in what the WG would consider someone else's first party
   context (making it a third party resource).  Hence, we have no way
   of complying with the definition as stated other than by always
   adhering to the third-party constraints.

   There is a simple solution: define the type of resource according
   to how the resource owner has designed it to be used, consistent
   with how that party chooses to use the resource on its own sites,
   and how it has documented the resource for use by others.
   TPE is already defined in those terms.
 

2) Outsourcing is not an exception -- it is the rule.  When we say a
   third party is acting as a first party, they *are* the first party.
   Likewise, when a contractor is acting on behalf of a third party,
   they are that third party.

   Very few sites on the Web fall into the billionaire's club of large
   Web providers that can afford to build their own analytics and
   advertising businesses within the same corporate shell.  The ability
   to contract for services from the likes of Adobe, AWS, and Akamai is
   what allows the small mom-and-pop web sites to compete and grow their
   businesses incrementally.  Creating an artificial distinction between
   in-house proprietary services and contractually-provided services is
   not acceptable because it would create an artificial market advantage
   for conglomerates, which would in turn invite scrutiny of this group
   for anti-trust concerns.

   The privacy issues regarding service provider data collection can be
   covered by constraints on ownership and sharing of data.  Adobe won't
   accept any requirement that prevents a contractor, acting on behalf
   of a first party customer, from doing or implementing anything that
   a conglomerate could have done within the first party constraints.

   The solution is to define outsourced service providers as the same
   party if they only act as data processors on behalf of that party,
   silo the data so that it cannot be accessed by other parties, and
   have no control over that data except as directed by that party.
   Hence, requirements on properly outsourced service providers for a
   party (first or third) should not be any more restrictive or
   permissive than the requirements on that party's employees.
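
   A minimal sketch of that same-party test, for illustration only.
   The class and field names are hypothetical and do not come from
   any WG document; each flag stands for one of the three conditions
   in the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceContract:
    # Hypothetical flags, one per condition named in the text.
    acts_only_as_data_processor: bool  # processes data only on behalf of the party
    data_siloed_per_party: bool        # no other party can access the data
    control_only_as_directed: bool     # provider has no independent control


def is_same_party(contract: ServiceContract) -> bool:
    """True when the provider would be treated as the contracting party itself."""
    return (
        contract.acts_only_as_data_processor
        and contract.data_siloed_per_party
        and contract.control_only_as_directed
    )
```

   Under this sketch, a provider that fails any one condition is
   treated as a separate party.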


3) We cannot comply with any requirements on "receives".

   What a server receives is determined by the user agent.
   Any requirement on what we receive would cause us to be non-compliant
   as soon as some joker decides to use curl to send us more information
   than we would expect, or when some lazy developer copies and pastes
   HTML content from one of our own websites into a third party context
   in which we did not expect to receive data.

   There is no justification for those requirements in the spec.


4) How we implement something is none of your business.

   The proposal contains numerous half-baked ideas on how various services
   shall be implemented.  None of that belongs in a compliance document for
   a network protocol.  Requirements must be on externally observable events
   that are known to be privacy sensitive, not internal implementations.
   Otherwise, the spec would inhibit innovation and prevent adequate
   solutions that might better fit the scale or sensitivity of data for
   a given context.

   If you have specific solutions in mind for better ways of advertising,
   auditing, or analytics, then I encourage you to start your own business
   (or open source charity) and demonstrate the merits of that implementation
   in practice.

   What this WG needs to concern itself with is actual loss of privacy,
   not theoretical ideas of what might constitute a bulletproof service.
 

5) Unlinkability does not need to be expressed in terms of probability.
   
   Data is not personally identifiable if it cannot be associated with
   an individual.  Data that is not personally identifiable should have a
   blanket exception -- there is no need for additional requirements on
   validation, and certainly not a magic number like 1024-unlinkable.

   DNT requirements should only apply to the handling of linkable data.
   Probability would only estimate the likelihood of failure to comply.


6) The requirements should distinguish between communication records
   (log files) and data collection for operational use.  Log files should
   not be subject to limitations on tracking if they are not used for
   operations other than what is necessary to support the permitted uses.
   Log files are still subject to other limitations under data protection
   laws, but that is independent of whether the user asks for DNT.

   We only need to require that:
     a) log files record the DNT status for each request until the
        records are destroyed or rendered unlinkable; and,
     b) aside from the permitted uses, processing a logfile is only
        allowed if all records marked as DNT:1 are excluded from that
        processing or if the data derived as a result of the
        processing is entirely unlinkable.
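
   Requirements (a) and (b) could be sketched roughly as follows.
   This is an illustration under assumptions, not a proposed
   implementation: the record layout, field names, and function name
   are all hypothetical.  It covers only the exclusion branch of (b);
   deciding whether derived data is "entirely unlinkable" is left out.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class LogRecord:
    # Hypothetical record layout; field names are assumptions.
    client_id: str       # anything linkable to a user or user agent
    path: str            # requested resource
    dnt: Optional[str]   # (a) DNT header value kept with the record, None if absent


def eligible_records(records: Iterable[LogRecord],
                     permitted_use: bool) -> Iterator[LogRecord]:
    """Yield the records a processing job may read under (b).

    Permitted uses see every record; any other processing skips
    the records marked DNT:1.
    """
    for record in records:
        if permitted_use or record.dnt != "1":
            yield record
```

   For example, a reporting job that is not a permitted use would
   iterate over eligible_records(logs, permitted_use=False) and so
   never read a record that was logged with DNT:1.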


Specific comments
=================

> 2. Parties, First Parties and Third Parties

  As described above, service providers that are only data processors
  should be considered within the same party as the contracting party.

  Differing constraints on third party versus first party services need
  to be rephrased in terms of the design of each resource rather than
  the intent of the user.

  If a resource is designed for direct interaction, is only used by
  the resource owner on its own sites for direct interaction, and
  is not documented by the resource owner for use as an embedded API
  for other sites, then the resource need only comply with first-party
  requirements.  Otherwise, the resource must comply with third-party
  requirements unless it can dynamically determine that it has been
  invoked in a first party context.
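
  The rule in the preceding paragraph can be written down as a small
  decision function.  This is only a sketch: the type names, field
  names, and the detected_first_party_context flag are hypothetical,
  and how a resource would actually detect its context at run time is
  not specified here.

```python
from dataclasses import dataclass
from enum import Enum


class Requirements(Enum):
    FIRST_PARTY = "first-party"
    THIRD_PARTY = "third-party"


@dataclass(frozen=True)
class Resource:
    # Hypothetical description of how the owner designed and documented it.
    designed_for_direct_interaction: bool
    owner_uses_only_for_direct_interaction: bool
    documented_for_embedding: bool  # published as an embedded API for other sites


def applicable_requirements(resource: Resource,
                            detected_first_party_context: bool = False) -> Requirements:
    """Pick the requirement set from the resource's design, not the user's intent."""
    designed_first_party = (
        resource.designed_for_direct_interaction
        and resource.owner_uses_only_for_direct_interaction
        and not resource.documented_for_embedding
    )
    if designed_first_party or detected_first_party_context:
        return Requirements.FIRST_PARTY
    return Requirements.THIRD_PARTY
```

  The point of the sketch is that every input is a property the
  resource owner controls or can observe; none depends on guessing
  what the user intended.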

> 3. Information Practices
> 
>   3.1 Reception, Retention, Use, and Sharing
> 
>    A party receives data if the data comes within its control.

irrelevant

>    A party retains data if the data remains within the party's control.

"... beyond the scope of the current interaction."

>    A party uses data if the party processes the data for any purpose,
>    including for storage.

"... other than merely forwarding it to another party."

>    A party shares data if the party enables another party to receive the
>    data.

"A party shares data if it allows any other party to receive or
access that data."

>   3.2 First Party
> 
>    A first party must not share information with a third party that the third
>    party is prohibited from receiving itself.

There is no way for a first party to know what a third party is
prohibited from receiving, both because we can't prohibit receiving
and because the potential for prior consent always exists.

>    Best Practice 1: Additional Voluntary Measures
> 
>    A first party may voluntarily take steps to protect user privacy when
>    responding to a Do Not Track request.

That text serves no useful purpose.

>   3.3 Third Party
> 
>     3.3.1 General Rule
> 
>    A third party must not receive, retain, use, or share any information
>    related to communication with a user or user agent. There are exceptions
>    to this general rule as defined in the following sections. In case of
>    ambiguity, an exception must be construed narrowly. Each exception
>    operates independently; exceptions cannot be combined except where
>    explicitly noted otherwise.

No, exceptions are permissions and they combine just fine.

As mentioned a dozen times before, I object to blanket prohibition
of things that a third party obviously has to do just to process
a request, even if it is followed by a specific list of exceptions
that might or might not be enough to actually process that request.
If you can't formulate a requirement in terms of actual tracking
or the data collected as a result of tracking, then I will not
implement that requirement.  As such, this entire section 3.3
is not suitable for our specification.

>       3.3.2.3 Outsourcing

Outsourcing is a general aspect of implementation by any party.
It should not be buried under third party.

>    A first party may outsource website functionality to a third party, in
>    which case the third party may act as the first party under this standard
>    with the following additional restrictions.
> 
>         3.3.2.3.1 Technical Precautions
> 
>         3.3.2.3.1.1 Operative Text
> 
>    Throughout all data reception, retention, and use, outsourced service
>    providers must use all feasible technical precautions to both mitigate the
>    linkability of and prevent the linking of data from different first
>    parties.

Nonsense, that's unimplementable in practice (nobody knows what
"all feasible technical precautions" are at any given moment).
Furthermore, this requirement is not specific to outsourcing --
it would apply to any third party.  And in the specific case of
outsourcing, we don't have control over the data and thus cannot
prevent two different parties from collecting the same identifiable
information, thereby making the data sets linkable by accident.

>    Structural separation ("siloing") of data per first party, including both
>     1. separate data structures and
>     2. avoidance of shared unique identifiers
>    are necessary, but not necessarily sufficient, technical precautions.

Not phrased as a testable requirement.  Moreover, shared identifiers
are only a tracking concern if they are retained by the server in a
form that could be correlated across multiple first parties.

>         3.3.2.3.1.2 Non-Normative Discussion

Actually, the contents of this section consist of a number of
suggested best practices with an occasional mixture of normative
"must" and "should" statements.  They do not belong here.

...

>         3.3.2.3.1.2.2 Siloing in the Backend

See general note about "none of your business".  This section and all
subsections are not suitable for our documents.

...

>         3.3.2.3.1.2.3 Retention in the Backend
> 
>    An outsourcing service should retain information only so long as necessary
>    to provide necessary functionality to a first party. If a service creates
>    periodic reports, for example, it should delete the data used for a report
>    once it is generated. An outsourcing service should be particularly
>    sensitive to retaining protocol logs, since they may allow correlating
>    user activity across multiple first parties.

There is no chance that Adobe will agree to delete data that belongs
to the first party just because we are a contractor.  By definition,
we do not own that data.  And protocol logs are not specific to DNT.

...

>         3.3.2.3.3 Use Direction
> 
>    An outsourced service
>     1. must use data retained on behalf of a first party ONLY on behalf of
>        that first party, and

That belongs in the definition of outsourced service provider.

>     2. must not use data retained on behalf of a first party for their own
>        business purposes, or for any other reasons.

Too broad.  Please use the definition of data processor from the EU.

Data must be used in order to retain it in the first place;
that is the outsourcer's business purpose.  The outsourcer might
provide continued storage and backups of the data, along with a
suite of tools for use by the party for mining their data to
produce reports; dissemination of those reports is entirely
controlled by the contracting party, not the outsourcer.

An outsourcer must be able to do anything that a party can do with
that data, so long as the outsourcer is acting at the direction
of that party. Furthermore, the data must be usable by the
outsourcer as necessary for capacity planning and billing the
party for services provided.  Likewise, aggregate statistics
based on received requests (across all parties) can be used if
the statistics do not contain anything linkable to persons.

In general, DNT must not apply to outsourced service providers
acting as a first party any more than it applies to first
parties: no sharing is allowed beyond the contracting party
relationship.

...

>       3.3.2.4 User Permission
> 
>    A website may engage in practices otherwise prohibited by this standard if
>    a user grants permission. Permission may be attained through the browser
>    API defined in the companion Tracking Preference Expression document. A
>    website may also rely on "out-of-band" consent attained through a
>    different technology. An "out-of-band" choice mechanism has the same
>    effect under this standard as the browser exception API, provided that it
>    satisfies the following bright-line requirements:
>     1. Actual presentation: The choice mechanism must be actually presented
>        to the user. It must not be on a linked page, such as a terms of
>        service or privacy policy.
>     2. Clear terms: The choice mechanism must use clear, non-confusing
>        terminology.
>     3. Independent choice: The choice mechanism must be presented independent
>        of other choices. It must not be bundled with other user preferences.
>     4. No default permission: The choice mechanism must not have the user
>        permission preference selected by default.

Actually, an out of band consent mechanism provides additional
permissions, whether or not they fall within the scope of this
specification.  In all cases, prior consent overrides DNT because
it is the only means of enabling the user to selectively allow
data collection by specific trusted parties without constantly
having to change their browser configuration.

There is still no need to define how prior consent is obtained,
provided we define it as specific, explicit, and informed.

>    An "out-of-band" choice mechanism must additionally satisfy the following
>    high-level standard:
> 
>    An ordinary user would know that the choice overrides his or her privacy
>    protections under this standard.

No. Most consent actions do not, in any way, harm a user's privacy,
and ordinary users don't read standards.

>       3.3.2.5 Security
> 
>         3.3.2.5.1 Operative Text
> 
>    A third party may receive, retain, and use data about a particular user or
>    user agent for the purpose of ensuring its security, provided that there
>    are reasonable grounds to believe the user or user agent was attempting to
>    breach the party's security at the time the data was received.
> 
>    Note: This draft does not address the extent to which technical and
>    business precautions are required for security data.

As already explained, DNT must not have any effect on security or
fraud controls, period.  That is not negotiable.  Data retention for
those purposes may be limited to what is necessary for those purposes,
but the notion that sending "DNT: 1" has any effect whatsoever on
security and fraud controls is a non-starter.


Cheers,

....Roy T. Fielding, Principal Scientist, Adobe Systems Inc.
    <http://adobe.com/enterprise>
Received on Saturday, 16 June 2012 09:53:32 UTC