RE: Initial feedback on the well-known URI Proposal from Kevin Smith on 2012-03-06 (public-tracking@w3.org from March 2012)

From: Kevin Smith <kevsmith@adobe.com>
Date: Mon, 5 Mar 2012 16:22:12 -0800
To: "Amy Colando (LCA)" <acolando@microsoft.com>, Matthias Schunter <mts@zurich.ibm.com>, John Simpson <john@consumerwatchdog.org>
CC: "public-tracking@w3.org" <public-tracking@w3.org>
Message-ID: <6E120BECD1FFF142BC26B61F4D994CF3064CC1FF04@nambx07.corp.adobe.com>
Great idea, Amy.  I think we should spend some time on this as well.  However, since it sounds to me like there are no real functional differences, there should be no differences from the user's POV either.  They should behave identically for the user.  I believe the question is now primarily one of cost: bandwidth, ease of implementation and management, etc.  The only ways I can think of that the choice would affect the user are if one option slows down the page load, or if one is harder to implement and lengthens the time it takes for a site to become compliant.  Nothing direct, though.  Am I right in this assumption?

-----Original Message-----
From: Amy Colando (LCA) [mailto:acolando@microsoft.com] 
Sent: Monday, March 05, 2012 5:06 PM
To: Matthias Schunter; John Simpson
Cc: public-tracking@w3.org
Subject: RE: Initial feedback on the well-known URI Proposal

Can I put in a plug for spending some significant time (1-2 hours) on this during F2F?  I'd really appreciate a walk-through from a user's POV of each option, as well as the technical pros and cons. Thanks.

-----Original Message-----
From: Matthias Schunter [mailto:mts@zurich.ibm.com]
Sent: Monday, March 05, 2012 5:42 AM
To: John Simpson
Cc: public-tracking@w3.org
Subject: Re: Initial feedback on the well-known URI Proposal

Hi John,

thanks for the suggestion.

For simplicity, I would prefer to choose either the URI or headers, not both.
Otherwise, we have to define semantics for their interplay (does a header override the info at the well-known URI, or vice versa?).

However, if we see that both have complementary benefits then we might allow both.

Regards,
matthias


On 3/1/2012 9:53 PM, John Simpson wrote:
> If there are differing advantages to the response header vs. the 
> well-known URI depending on the size and simplicity of the website, 
> would that imply that the spec should offer the option of responding 
> with either and leave the choice to the implementer?  I'm not 
> advocating, I'm asking...
> 
> John
> 
> On Mar 1, 2012, at 12:22 AM, Matthias Schunter wrote:
> 
>> Hi!
>>
>> I started a comparison here:
>> http://www.w3.org/wiki/DntResponseHeaderOrURI
>> Feel free to edit (I assume everyone can).
>>
>> I currently see three advantages for headers:
>> 1. They are much simpler than the current proposal by Roy
>> 2. Their scoping is easier: While a URI resource needs to explain its
>>    scope, the header is just attached
>> 3. Manageability in a large enterprise (say ibm.com
>> <http://ibm.com>) may be easier:
>>    The 'owners' of resources can just attach headers, while
>>    maintaining well-known URIs in a single place for all
>>    resources requires synchronisation
>>
>> An advantage for URIs is that for simple sites, they are easier; just 
>> put a minimal tracking-status at the well-known URI and you are done.
>>
>> Feedback welcome (in particular for enhancing the wiki page).
>>
>>
>> Regards,
>> matthias
>>
>>
>>
>> On 2/29/2012 10:14 PM, Kevin Smith wrote:
>>> From reading Roy's description, it sounds to me like there is at 
>>> least one piece of functionality available when using a URI vs a 
>>> header - that you can request the policy before actually hitting 
>>> the page.  This does not seem like a huge advantage to me, but it's 
>>> nice to know the options.
>>>
>>> The question I have is, what can a header do that a URI cannot?
>>>  If, other than the above mentioned minor discrepancy, they are 
>>> functionally equivalent, which I suspect is true, then this simply 
>>> becomes a question of cost analysis.  If the benefits are 
>>> equivalent, pick the method that is easier to implement, and cheaper 
>>> to maintain and use.
>>>
>>> -----Original Message-----
>>> From: Roy T. Fielding [mailto:fielding@gbiv.com]
>>> Sent: Wednesday, February 29, 2012 12:54 PM
>>> To: Matthias Schunter
>>> Cc: public-tracking@w3.org <mailto:public-tracking@w3.org>
>>> Subject: Re: Initial feedback on the well-known URI Proposal
>>>
>>> On Feb 29, 2012, at 2:11 AM, Matthias Schunter wrote:
>>>
>>>> I now had a closer look at your proposal to transmit tracking 
>>>> status via well-known URI.
>>>>
>>>> I believe that both proposals, headers and URIs have benefits. I 
>>>> need to continue trying to understand their pros and cons.
>>>
>>> My goal was to capture all of the WG's use cases, including those 
>>> that would be prohibitively expensive to include in a header.  I am 
>>> hoping that reviewers will consider all of the possible things they 
>>> need from a response, for whatever reasons they might need them, and 
>>> make sure that the tracking status resource satisfies those cases.
>>> If not, it is relatively easy to add those cases when working with a 
>>> separate resource.
>>>
>>>> Here is some initial feedback on the proposal:
>>>>
>>>> 1. I like the URI proposal and I believe it has merit. We need 
>>>> to understand whether the URI, the header, or both are the way
>>>> forward
>>>
>>> I will be sad if we can't agree to have just the resource, unless we 
>>> have a use case that cannot be satisfied by the separate resource 
>>> space.
>>>
>>>> 2. A main goal of DNT (my perspective) is simplicity and ease of 
>>>> use/understanding. I believe that the overall scheme should be 
>>>> minimalistic to keep it as simple as possible. We spent time in 
>>>> Brussels slimming the headers to the minimal info that is essential.
>>>> I'd like to do a similar exercise for your proposal.
>>>
>>> My proposal has many more details because it satisfies several more 
>>> use cases than the header proposal.  For example, the echo DNT use 
>>> case, the ability to distinguish specific exceptions, providing a 
>>> list of domains to be considered first-party, extensibility, etc.
>>> It is a complete proposal and, IMO, vastly superior to sending a 
>>> header field on every response because it satisfies all of the use 
>>> cases without impacting existing implementations or caching.
>>>
>>> It has benefited from all of the prior discussion we have had on 
>>> header fields.  It is just a different way to address the same 
>>> problem (a more Web-centric, RESTful way, if I may add, though I bet 
>>> somebody will eventually complain that application/json isn't a 
>>> hypertext type).
>>>
>>>> This means that I would omit all fields that are not essential to 
>>>> make the proposal slim and similar to the headers.
>>>> - Fields I would remove are
>>>> same-site, edits, partners, received (we agreed that it is not 
>>>> needed; it no longer exists in the headers either)
>>>
>>> That would eliminate the use cases for identifying the scope of 
>>> first-party, providing individual control over the data that has 
>>> been collected, providing fair warning (before the real resource
>>> request) of what third-party trackers are used by the site, and 
>>> echoing the DNT field back to the client to detect evil 
>>> intermediaries.  The only reason we don't have those cases handled 
>>> by the header field is because it would be prohibitively expensive 
>>> to do so in headers, either because of the size or because of the 
>>> effect on the cacheability of normal responses.  Hence, your 
>>> suggested deletions would remove most of the reasons why the 
>>> resource fulfills the needs of the privacy and regulator folks 
>>> better than the header field proposal.
>>>
>>> I am not wedded to the member names -- same-site just seemed more 
>>> natural than first-party-scope.  I am not sure if we need the use 
>>> case for partners (identifying third-parties before one goes to the 
>>> site), since that may be too hard to manage, but it should at least 
>>> be considered by the WG.
>>>
>>>> - I am not sure about the options as a separate field since the 
>>>> policy may link to it, too.
>>>
>>> Specific links to enable individual control are a requirement of the 
>>> regulators.  They should not be buried in a policy doc.
>>>
>>>> - I also would focus on fields that are usually static (e.g.,  not 
>>>> having a 'received' field)
>>>
>>> Why?  The main reason I wasn't able to convince folks at the start 
>>> of the header field discussion that a well-known resource would 
>>> satisfy their concerns is the preconception that such a resource is 
>>> always just a file -- that it couldn't be dynamic enough.  This 
>>> proposal demonstrates how dynamic it can be.
>>>
>>>> 3. I would fold 'tracking' and 'response' into a single field that 
>>>> has the same values as the headers (no-tracking, first-party, 
>>>> service-provider, tracking)
>>>
>>> I have no interest in that change, for efficiency reasons.  Most 
>>> sites do no tracking of any kind, and having that declared by a 
>>> boolean up front allows for the use case of sites that don't want to 
>>> be associated with the tracking-but-limited-to-exemptions sites.
>>>
>>>> 4. A new comment: While I understand the idea of the path field 
>>>> (scoping of status objects), I do not understand its semantics well
>>>> enough. E.g., I would not know which status object to apply if there
>>>> are two objects:
>>>>    Well-known URI   Path in Object
>>>>    /sub/ //sub
>>>
>>> The spec describes a specific algorithm for deciding it in 5.1.2:
>>>
>>>    A user agent may check the tracking status for a given resource URI
>>>    by making a retrieval request for the well-known address
>>>      /.well-known/dnt
>>>    relative to that URI.
>>>    ...
>>>
>>>    Once the tracking status representation is obtained, parse the
>>>    representation as JSON to extract the Javascript status-object.
>>>    If parsing results in a syntax error, the user agent should
>>>    consider the site to be non-conformant with this protocol.
>>>
>>>    If the status-object does not have a member named path or if the
>>>    value of path is not "/" and not a prefix of the path component for
>>>    the URI being checked, then find the service-specific tracking
>>>    status resource by taking the template
>>>       /.well-known/dnt{+pathinfo}
>>>    and replacing {+pathinfo} with the path component of the URI being
>>>    checked. Perform a retrieval request on the service-specific
>>>    tracking status resource and process the result as described above
>>>    to obtain the specific tracking status.
>>>
>>> Note that the second status-object retrieved is not examined to see 
>>> if its path component is consistent -- it applies regardless.
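
The lookup quoted from section 5.1.2 above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the spec's reference behaviour: the `fetch` callable, the `NonConformantSite` exception, and the function names are hypothetical, and treating a non-object JSON value or a non-string path as "base object does not apply" is a guess the quoted text does not settle.

```python
import json
from typing import Callable
from urllib.parse import urlsplit

WELL_KNOWN = "/.well-known/dnt"


class NonConformantSite(Exception):
    """Hypothetical signal that the status representation failed to parse."""


def parse_status(body: str) -> dict:
    """Parse a tracking status representation as JSON."""
    try:
        status = json.loads(body)
    except json.JSONDecodeError as exc:
        # Quoted text: a syntax error means the site is non-conformant.
        raise NonConformantSite(str(exc)) from exc
    if not isinstance(status, dict):
        # Assumption: a status-object must be a JSON object.
        raise NonConformantSite("status-object is not a JSON object")
    return status


def base_applies(status: dict, resource_path: str) -> bool:
    """True if the base status-object covers the path being checked."""
    path = status.get("path")
    if not isinstance(path, str):
        # No usable path member, so the specific resource is needed.
        return False
    return path == "/" or resource_path.startswith(path)


def tracking_status(resource_uri: str, fetch: Callable[[str], str]) -> dict:
    """Resolve the status-object that applies to resource_uri.

    `fetch` is a caller-supplied function returning the body for a URL.
    """
    parts = urlsplit(resource_uri)
    origin = f"{parts.scheme}://{parts.netloc}"
    resource_path = parts.path or "/"

    base = parse_status(fetch(origin + WELL_KNOWN))
    if base_applies(base, resource_path):
        return base

    # Expand /.well-known/dnt{+pathinfo} with the resource's path component.
    # The second object's own path member is not examined; it applies as is.
    return parse_status(fetch(origin + WELL_KNOWN + resource_path))
```

Passing `fetch` in as a parameter keeps the sketch free of any particular HTTP library and makes the one-request versus two-request cases easy to observe.
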
>>>
>>>> Some more questions:
>>>> 1. Can there be multiple status-objects at one well-known URI?
>>>
>>> No, that is not allowed by the ABNF.
>>>
>>>> 2. We should attempt to find a way to minimize the number of 
>>>> requests to the well-known URI.
>>>
>>> We already have.  In almost all real cases, there will be exactly 
>>> one per site per 24 hours (or longer if the site has declared a TTL 
>>> for this response), and only then for user agents actively verifying 
>>> the tracking status.  In all other cases, it is two requests, one 
>>> for the base "/.well-known/dnt" and a second for a specific path.
>>> If a site wants to minimize secondary requests, it can do so by 
>>> providing no more than one common path on its site per applicable 
>>> policy, which is how URI delegation works naturally.
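
The request pattern described above (one lookup per status resource per 24 hours, or longer when the site declares a TTL) could be approximated on the client side with a small cache. The thread does not say how a site declares its TTL, so the sketch below assumes a hypothetical `fetch` callable that returns the body together with the declared TTL in seconds, or `None` if there is none. It also assumes a declared TTL only ever extends the 24-hour default, which is one reading of "or longer".

```python
import time
from typing import Callable, Dict, Optional, Tuple

DEFAULT_TTL_SECONDS = 24 * 60 * 60  # "one per site per 24 hours"

# Hypothetical fetcher: returns (body, declared TTL in seconds or None).
Fetch = Callable[[str], Tuple[str, Optional[float]]]


class StatusCache:
    """Caches tracking status bodies so each URL is fetched at most once per TTL."""

    def __init__(self, fetch: Fetch,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}  # url -> (expiry, body)
        self.requests_made = 0

    def get(self, url: str) -> str:
        now = self._clock()
        cached = self._entries.get(url)
        if cached is not None and now < cached[0]:
            return cached[1]  # still fresh, no network request

        body, declared_ttl = self._fetch(url)
        self.requests_made += 1

        lifetime = float(DEFAULT_TTL_SECONDS)
        if declared_ttl is not None:
            # Assumption: a declared TTL can lengthen the default, not shorten it.
            lifetime = max(lifetime, declared_ttl)
        self._entries[url] = (now + lifetime, body)
        return body
```

A user agent that verifies status would put such a cache in front of both the base and the path-specific requests, so repeat visits within the TTL cost no extra round trips.
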
>>>
>>>
>>> Cheers,
>>>
>>> Roy T. Fielding                     <http://roy.gbiv.com/>
>>> Principal Scientist, Adobe Systems  <http://adobe.com/enterprise>
>>>
>>>
>>>
>>>
>>>
>>>
>>
> 
> ----------
> John M. Simpson
> Consumer Advocate
> Consumer Watchdog
> 1750 Ocean Park Blvd., Suite 200
> Santa Monica, CA 90405
> Tel: 310-392-7041
> Cell: 310-292-1902
> www.ConsumerWatchdog.org <http://www.ConsumerWatchdog.org> 
> john@consumerwatchdog.org <mailto:john@consumerwatchdog.org>
>
Received on Tuesday, 6 March 2012 00:22:52 UTC