Re: action-317, discussion of the service-provider flag and the same-party resource from Roy T. Fielding on 2013-01-22 (public-tracking@w3.org from January 2013)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Tue, 22 Jan 2013 03:22:43 -0800
To: David Singer <singer@apple.com>
Cc: "public-tracking@w3.org WG" <public-tracking@w3.org>
Message-Id: <3751C2DC-56CF-4FFF-A68B-821AF7DC1C7F@gbiv.com>

This message completes ACTION-354 and regards ISSUE-137 ...

On Nov 28, 2012, at 4:07 PM, David Singer wrote:

> This discussion started in <http://www.w3.org/mid/43B85F59-7BD8-43D7-80B3-20F8FF1DEB1A@apple.com>.  This is the reference I was searching for on the call today.
> 
> 
> I want to remind people we're not talking about 'all' kinds of service provision here.  We're specifically not talking about ISPs, hosting, content authoring, or anything EXCEPT services which are distinct end-points of HTTP transactions.

Why?  To be clear, why are those other service providers that are fully
capable of tracking the user treated differently from a site operator
(assuming that's what you mean by distinct end-points)?
What about CDNs like Akamai?
What about AWS?

I sometimes think we aren't being clear on the definitions of party
and service provider.  According to the compliance document, all
contractors that work under NDA are either service providers
or third parties.  That means any company that involves a contractor
in the provision of service "at the HTTP endpoint" is a service
provider, not a first party, because the data has been shared
with the contractor.  All of Apple's first-party websites, for example,
are actually service providers.  So would any site that is popular
enough to require multiple data centers or protection against
denial of service attacks.

> The basic problem we're trying to avoid is having a sensitive user and/or user-agent flag a site as suspiciously claiming rights it doesn't seem to have -- either first-party status when it's not the first party, or consent when it doesn't appear to have 3rd-party-with-consent status.

Again, why?  If we thought that was an actual problem, wouldn't the
service providers be the ones wanting to address it? Why would a
suspicious party be trusted just because they send an "s"?

> Let's take a simple hypothetical example.
> 
> VisitCounter.com offers a 'basic' and a 'pro' service. They provide 3 levels of service: free, basic, and pro.  Free service gives you a widget iframe you can embed, that visibly counts visitors.  Contract for basic, and they will provide you a report on your visitors -- silo'd, and so on -- but the domain is still theirs.  Pro, and they'll register under visits.<yourdomain>, and also provide cookie analysis, and so on.
> 
> For the basic service, they look like a different site, different party.  And they appear under that name on multiple sites, say examplebank.com and examplenews.com.
> 
> examplenews.com and examplebank.com could add them into their same-party array, but if they add both examplebank and examplenews into their same-party array, that appears to be a claim that those two are 'the same party', which is not true.  If they don't add them, we have no indication or back-pointer.  Even if they add dynamically, somehow detecting where they are embedded, the risk remains, I think, that someone will join these later.

I can't follow your argument here.  Who is "they"?

In the current draft, examplenews.com would have a tracking status
resource (TSR) on its own site. If examplenews.com's same-party array
includes VisitCounter.com, this means that examplenews.com claims
that data collected via the references from examplenews.com to
VisitCounter.com remains under the exclusive control of examplenews.com
(i.e., VisitCounter.com is either the same party or a service provider
acting as that same party, and thus subject to the data controls that
give VisitCounter.com permission to track with siloing).

> Adding the 'I am a service provider' flag to their response, however, can help make it clear that they stand in a *service provider* relationship with some or all of the sites in the same-party array.

I think this is the source of confusion.  The same-party array is
at examplenews.com and examplebank.com.  The contents of the same-party
array at VisitCounter.com would only be relevant if VisitCounter.com
subcontracts its own services under the service provider rules;
it would never contain examplenews.com and examplebank.com.
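
As a rough illustration of where each same-party claim lives, here is a
minimal sketch in Python.  It is not taken from the draft: the dictionary
layout, the TSRS name, and the claims_same_party helper are hypothetical,
and only the "same-party" member mentioned above is modelled.  The point
it tries to show is that a claim is read from the publisher's own TSR and
is directional.

```python
# Hypothetical, simplified tracking status resources (TSRs), keyed by the
# site that publishes them. Only the "same-party" member is modelled.
TSRS = {
    "examplenews.com": {"same-party": ["VisitCounter.com"]},
    "examplebank.com": {"same-party": ["VisitCounter.com"]},
    # VisitCounter.com would list only its own subcontractors, if any.
    "visitcounter.com": {"same-party": []},
}


def claims_same_party(tsrs, publisher, other):
    """Return True if publisher's own TSR lists other under same-party."""
    tsr = tsrs.get(publisher.lower(), {})
    # Domain names are compared case-insensitively.
    listed = {domain.lower() for domain in tsr.get("same-party", [])}
    return other.lower() in listed


print(claims_same_party(TSRS, "examplenews.com", "VisitCounter.com"))  # True
print(claims_same_party(TSRS, "examplebank.com", "VisitCounter.com"))  # True
print(claims_same_party(TSRS, "examplenews.com", "examplebank.com"))   # False
print(claims_same_party(TSRS, "VisitCounter.com", "examplenews.com"))  # False
```

Under these assumptions, both sites listing VisitCounter.com does not
produce any claim that examplenews.com and examplebank.com are the same
party, and VisitCounter.com's own array never needs to name either site.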

> Big sites can afford to pay for 'pro', so the argument that the flag discriminates in favor of the large doesn't hold.
> 
> What am I missing?

The only possible reason to use a flag to indicate that a service provider
is involved in the provision of services is to perform some form of
automated discrimination against service providers.  A flag provides
no data transparency to the user.  What we already have in the draft
is better than an "s" flag.

Let's consider your scenario above, but without any "s" flag.

The user agent sees a reference to VisitCounter.com on a page at
examplenews.com.  Since it is in paranoid mode, it wants to verify
that all links will adhere to DNT.  It probably already has a cached
copy of the TSR for examplenews.com and sees that it either

  a) does list VisitCounter.com under same-party
  b) does not list VisitCounter.com under same-party

If (a), then examplenews.com is claiming to have control over data
collected by VisitCounter.com.  If (b), no information is gained.

The UA then requests the TSR for VisitCounter.com.  It will either
decline to implement DNT or eventually provide a tracking status
value (TSV) that applies to the link used at examplenews.com to
reference its service.  That final TSV will be one of

  !) meaning the same as declining to implement DNT;
  C) it is a third party that has the user's consent to track;
  3) meaning that it is obeying the requirements on a third party
     regardless of what examplenews.com said; or,
  1) meaning that it is either the first party for this designated
     resource or a service provider acting as a first party.

The UA then looks at the new "first-party" member in the TSR for
VisitCounter.com.  That member is one of:

  e) not provided
  f) a link to a resource that says VisitCounter.com is the
     data controller
  g) a link to a resource that says examplenews.com is the
     data controller

Now, let's break it down by the options listed above for this
particular designated resource at VisitCounter.com:

  If (!), VisitCounter.com can't be trusted regardless.
  If (3), it is a third party that adheres to DNT.
  If (C), it can do whatever the user consented to.
  Hence, only (1) matters.

  If (1) and (e)/(f), then VisitCounter.com considers itself to be
  the first party, which should be a concern for the UA regardless
  of what is claimed by examplenews.com in (a) or (b).  In fact,
  it might be a service provider (good news), but is contractually
  prevented from claiming that it provides service on behalf of
  examplenews.com -- they'll just have to live with the lost traffic
  from concerned UAs.

  If (1) and (g), then VisitCounter.com is claiming to be acting
  as a first party on behalf of examplenews.com.  Hence, it is
  claiming to be a service provider for examplenews.com.
  But is it lying?  Well, if (a) is the case, then we have a
  confirmation from examplenews.com that it considers its links
  to VisitCounter.com to be the same party.  If (b) is the case,
  then one of the following is true:

    y) VisitCounter.com is a service provider, but examplenews.com
       is either unwilling or too lazy to advertise its own service
       provider relationships in same-party; or

    z) VisitCounter.com is lying about being a service provider to
       examplenews.com.

  The only way for the user to find out if (y) or (z) is true is
  for the user to ask examplenews.com via some non-automated means.
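
The case analysis above can be sketched as a small decision function.
This is only an illustration in Python under stated assumptions, not an
algorithm from the draft: the assess name, its parameters, and the result
strings are hypothetical, and a "first-party" link that names any
controller other than the embedding site is treated here like case (f).

```python
def assess(tsv, controller, listed_in_same_party,
           embedding_site="examplenews.com"):
    """Summarize what a cautious UA can conclude about an embedded resource.

    tsv: the final tracking status value, one of "!", "C", "3", "1".
    controller: the data controller named by the resource that the
        "first-party" member links to, or None if the member is not
        provided (case e).
    listed_in_same_party: True for case (a), False for case (b).
    """
    if tsv == "!":
        return "cannot be trusted"
    if tsv == "3":
        return "third party adhering to DNT"
    if tsv == "C":
        return "third party acting on user consent"
    if tsv != "1":
        raise ValueError(f"unexpected tracking status value: {tsv!r}")

    # Only (1) needs further analysis.
    names_embedder = (
        controller is not None
        and controller.lower() == embedding_site.lower()
    )
    if not names_embedder:
        # Cases (e) and (f): it presents itself as the first party.
        return "claims first-party status itself: a concern for the UA"
    if listed_in_same_party:
        # Cases (g) and (a): both sides agree.
        return "service provider claim confirmed by the embedding site"
    # Cases (g) and (b): either (y) or (z); not decidable automatically.
    return "service provider claim unconfirmed: ask the embedding site"


print(assess("1", "examplenews.com", True))
print(assess("1", "examplenews.com", False))
print(assess("1", None, True))
```

Nothing in this sketch would branch differently if an "s" flag were
supplied as an extra input.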
  
Note that having an "s" flag doesn't change any of the above cases.
It is not useful since we added the policy link, which has now been
replaced by an even clearer first-party link.  The scenario you are
trying to address has already been addressed with a mechanism that
provides actual transparency to the user, to the extent that it is
allowed by the first party.


Cheers,

Roy T. Fielding                     <http://roy.gbiv.com/>
Senior Principal Scientist, Adobe   <https://www.adobe.com/>
Received on Tuesday, 22 January 2013 11:23:00 UTC