Re: Initial feedback on the well-known URI Proposal from Matthias Schunter on 2012-03-05 (public-tracking@w3.org from March 2012)

From: Matthias Schunter <mts@zurich.ibm.com>
Date: Mon, 05 Mar 2012 14:41:39 +0100
To: John Simpson <john@consumerwatchdog.org>
CC: public-tracking@w3.org
Message-ID: <4F54C293.1040703@zurich.ibm.com>
Hi John,

thanks for the suggestion.

For the sake of simplicity, I would prefer to choose either the URI or
the headers, not both. Otherwise, we have to define semantics for their
interplay (does a header override the info at the well-known URI, or
vice versa?).

However, if we see that both have complementary benefits then we might
allow both.

Regards,
matthias


On 3/1/2012 9:53 PM, John Simpson wrote:
> If there are differing advantages to the response header vs. the
> well-known URI depending on the size and simplicity of the website,
> would that imply that the spec should offer the option of responding
> with either and leave the choice to the implementer?  I'm not
> advocating, I'm asking...
> 
> John
> 
> On Mar 1, 2012, at 12:22 AM, Matthias Schunter wrote:
> 
>> Hi!
>>
>> I started a comparison here:
>> http://www.w3.org/wiki/DntResponseHeaderOrURI
>> Feel free to edit (I assume everyone can).
>>
>> I currently see three advantages for headers:
>> 1. They are much simpler than the current proposal by Roy.
>> 2. Their scoping is easier: while a URI resource needs to explain
>>    its scope, the header is just attached.
>> 3. Manageability in a large enterprise (say ibm.com) may be easier:
>>    the 'owners' of resources can just attach headers, while
>>    maintaining well-known URIs in a single place for all
>>    resources requires synchronisation.
>>
>> An advantage for URIs is that for simple sites, they are easier; just
>> put a minimal tracking-status at the well-known URI and you are done.
>>
>> Feedback welcome (in particular for enhancing the wiki page).
>>
>>
>> Regards,
>> matthias
>>
>>
>>
>> On 2/29/2012 10:14 PM, Kevin Smith wrote:
>>> From reading Roy's description, it sounds to me like there is at
>>> least one piece of functionality available when using a URI vs a
>>> header - that you can request the policy before actually hitting
>>> the page.  This does not seem like a huge advantage to me, but
>>> it's nice to know the options.
>>>
>>> The question I have is, what can a header do that a URI cannot?
>>>  If, other than the above mentioned minor discrepancy, they are
>>> functionally equivalent, which I suspect is true, then this simply
>>> becomes a question of cost analysis.  If the benefits are
>>> equivalent, pick the method that is easier to implement, and
>>> cheaper to maintain and use.
>>>
>>> -----Original Message-----
>>> From: Roy T. Fielding [mailto:fielding@gbiv.com]
>>> Sent: Wednesday, February 29, 2012 12:54 PM
>>> To: Matthias Schunter
>>> Cc: public-tracking@w3.org
>>> Subject: Re: Initial feedback on the well-known URI Proposal
>>>
>>> On Feb 29, 2012, at 2:11 AM, Matthias Schunter wrote:
>>>
>>>> I now had a closer look at your proposal to transmit tracking status
>>>> via well-known URI.
>>>>
>>>> I believe that both proposals, headers and URIs, have benefits. I
>>>> need to continue trying to understand their pros and cons.
>>>
>>> My goal was to capture all of the WG's use cases, including those
>>> that would be prohibitively expensive to include in a header.  I am
>>> hoping that reviewers will consider all of the possible things they
>>> need from a response, for whatever reasons they might need them,
>>> and make sure that the tracking status resource satisfies those
>>> cases.  If not, it is relatively easy to add those cases when
>>> working with a separate resource.
>>>
>>>> Here is some initial feedback on the proposal:
>>>>
>>>> 1. I like the URI proposal and I believe it has its merit. We need
>>>>    to understand whether URI/header or both are the avenue to go
>>>>    forward
>>>
>>> I will be sad if we can't agree to have just the resource, unless
>>> we have a use case that cannot be satisfied by the separate
>>> resource space.
>>>
>>>> 2. A main goal of DNT (my perspective) is simplicity and ease of
>>>> use/understanding. I believe that the overall scheme should be
>>>> minimalistic to keep it as simple as possible. We spent time in
>>>> Brussels slimming the headers to the minimal info that is essential.
>>>> I'd like to do a similar exercise for your proposal.
>>>
>>> My proposal has many more details because it satisfies several more
>>> use cases than the header proposal.  For example, the echo DNT use
>>> case, the ability to distinguish specific exceptions, providing a
>>> list of domains to be considered first-party, extensibility, etc.
>>> It is a complete proposal and, IMO, vastly superior to sending a
>>> header field on every response because it satisfies all of the use
>>> cases without impacting existing implementations or caching.
>>>
>>> It has benefited from all of the prior discussion we have had on
>>> header fields.  It is just a different way to address the same
>>> problem (a more Web-centric, RESTful way, if I may add, though I
>>> bet somebody will eventually complain that application/json isn't a
>>> hypertext type).
>>>
>>>> This means that I would omit all fields that are not essential to
>>>> make the proposal slim and similar to the headers.
>>>> - Fields I would remove are
>>>> same-site, edits, partners, received (we agreed that it is not
>>>> needed; it no longer exists in the headers either)
>>>
>>> That would eliminate the use cases for identifying the scope of
>>> first-party, providing individual control over the data that has
>>> been collected, providing fair warning (before the real resource
>>> request) of what third-party trackers are used by the site, and
>>> echoing the DNT field back to the client to detect evil
>>> intermediaries.  The only reason we don't have those cases handled
>>> by the header field is because it would be prohibitively expensive
>>> to do so in headers, either because of the size or because of the
>>> effect on the cacheability of normal responses.  Hence, your
>>> suggested deletions would remove most of the reasons why the
>>> resource fulfills the needs of the privacy and regulator folks
>>> better than the header field proposal.
>>>
>>> I am not wedded to the member names -- same-site just seemed more
>>> natural than first-party-scope.  I am not sure if we need the use
>>> case for partners (identifying third-parties before one goes to the
>>> site), since that may be too hard to manage, but it should at least
>>> be considered by the WG.
>>>
>>>> - I am not sure about the options as a separate field since the
>>>> policy may link to it, too.
>>>
>>> Specific links to enable individual control are a requirement of
>>> the regulators.  They should not be buried in a policy doc.
>>>
>>>> - I also would focus on fields that are usually static (e.g.,  not
>>>> having a 'received' field)
>>>
>>> Why?  The main reason I wasn't able to convince folks at the start
>>> of the header field discussion that a well-known resource would
>>> satisfy their concerns is the preconception that such a resource is
>>> always just a file -- that it couldn't be dynamic enough.  This
>>> proposal demonstrates how dynamic it can be.
>>>
>>>> 3. I would fold 'tracking' and 'response' into a single field that
>>>> has the same values as the headers (no-tracking, first-party,
>>>> service-provider, tracking)
>>>
>>> I have no interest in that change, for efficiency reasons.  Most
>>> sites do no tracking of any kind, and having that declared by a
>>> boolean up front allows for the use case of sites that don't want
>>> to be associated with the tracking-but-limited-to-exemptions sites.
>>>
>>>> 4. A new comment: While I understand the idea of the path field
>>>> (scoping of status objects), I do not understand its semantics well
>>>> enough. E.g., I would not know which status object to apply if
>>>> there are two objects:
>>>> Well-known URI    Path in Object
>>>> /sub/
>>>> //sub
>>>
>>> The spec describes a specific algorithm for deciding it in 5.1.2:
>>>
>>>    A user agent may check the tracking status for a given resource
>>>    URI by making a retrieval request for the well-known address
>>>      /.well-known/dnt
>>>    relative to that URI.
>>>    ...
>>>
>>>    Once the tracking status representation is obtained, parse the
>>>    representation as JSON to extract the Javascript status-object.
>>>    If parsing results in a syntax error, the user agent should
>>>    consider the site to be non-conformant with this protocol.
>>>
>>>    If the status-object does not have a member named path or if the
>>>    value of path is not "/" and not a prefix of the path component
>>>    for the URI being checked, then find the service-specific
>>>    tracking status resource by taking the template
>>>       /.well-known/dnt{+pathinfo}
>>>    and replacing {+pathinfo} with the path component of the URI being
>>>    checked. Perform a retrieval request on the service-specific
>>>    tracking status resource and process the result as described
>>>    above to obtain the specific tracking status.
>>>
>>> Note that the second status-object retrieved is not examined to see
>>> if its path component is consistent -- it applies regardless.
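
The lookup quoted from section 5.1.2 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the specified implementation: `fetch` is a hypothetical callable that returns the response body for a URL, `NonConformantSite` is an invented exception name, and treating a non-object JSON value as non-conformant is an assumption the quoted text does not make.

```python
import json
from typing import Callable
from urllib.parse import urlsplit

WELL_KNOWN = "/.well-known/dnt"


class NonConformantSite(Exception):
    """Hypothetical error: the tracking status could not be parsed."""


def load_status(url: str, fetch: Callable[[str], str]) -> dict:
    """Fetch one tracking status resource and parse it as JSON."""
    try:
        status = json.loads(fetch(url))
    except json.JSONDecodeError as exc:
        # A syntax error means the site is considered non-conformant.
        raise NonConformantSite(url) from exc
    if not isinstance(status, dict):
        # Assumption: a status-object must be a JSON object.
        raise NonConformantSite(url)
    return status


def base_applies(status: dict, request_path: str) -> bool:
    """True if the site-wide status-object covers request_path."""
    path = status.get("path")
    if not isinstance(path, str):
        return False  # no usable path member: a second lookup is needed
    return path == "/" or request_path.startswith(path)


def resolve_tracking_status(uri: str, fetch: Callable[[str], str]) -> dict:
    """Return the status-object that applies to uri."""
    parts = urlsplit(uri)
    request_path = parts.path or "/"
    origin = f"{parts.scheme}://{parts.netloc}"
    base = load_status(origin + WELL_KNOWN, fetch)
    if base_applies(base, request_path):
        return base
    # Expand /.well-known/dnt{+pathinfo} with the path component.
    # The second status-object applies regardless of its own path.
    return load_status(origin + WELL_KNOWN + request_path, fetch)
```

With a status-object whose path is "/shop/", a check for /blog/post falls through to a second request for /.well-known/dnt/blog/post, which matches the two-request case described in this thread.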
>>>
>>>> Some more questions:
>>>> 1. Can there be multiple status-objects at one well-known URI?
>>>
>>> No, that is not allowed by the ABNF.
>>>
>>>> 2. We should attempt to find a way to minimize the number of
>>>> requests to the well-known URI.
>>>
>>> We already have.  In almost all real cases, there will be exactly
>>> one per site per 24 hours (or longer if the site has declared a TTL
>>> for this response), and only then for user agents actively
>>> verifying the tracking status.  In all other cases, it is two
>>> requests, one for the base "/.well-known/dnt" and a second for a
>>> specific path.
>>> If a site wants to minimize secondary requests, it can do so by
>>> providing no more than one common path on their site per applicable
>>> policy, which is how URI delegation works naturally.
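
The "one request per site per 24 hours" behaviour could be approximated on the client with a small cache. This is a sketch only: the thread does not say how a site declares a TTL, so here the hypothetical `fetch` callable returns the body together with a declared TTL in seconds (or None), and `StatusCache` is an invented name.

```python
import time
from typing import Callable, Dict, Optional, Tuple

DEFAULT_TTL_SECONDS = 24 * 60 * 60  # the 24 hours mentioned above

# Hypothetical fetch signature: url -> (body, declared TTL or None)
Fetch = Callable[[str], Tuple[str, Optional[float]]]


class StatusCache:
    """Caches tracking status bodies so each URL is fetched rarely."""

    def __init__(self, fetch: Fetch,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh: no request is made
        body, declared_ttl = self._fetch(url)
        # Assumption: a declared TTL can only lengthen the lifetime.
        lifetime = max(declared_ttl or 0.0, DEFAULT_TTL_SECONDS)
        self._entries[url] = (now + lifetime, body)
        return body
```

Injecting the clock keeps the sketch testable without waiting; a real user agent would more likely rely on its ordinary HTTP cache.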
>>>
>>>
>>> Cheers,
>>>
>>> Roy T. Fielding                     <http://roy.gbiv.com/>
>>> Principal Scientist, Adobe Systems  <http://adobe.com/enterprise>
>>>
>>>
>>>
>>>
>>>
>>>
>>
> 
> ----------
> John M. Simpson
> Consumer Advocate
> Consumer Watchdog
> 1750 Ocean Park Blvd., Suite 200
> Santa Monica, CA 90405
> Tel: 310-392-7041
> Cell: 310-292-1902
> www.ConsumerWatchdog.org
> john@consumerwatchdog.org
>
Received on Monday, 5 March 2012 13:42:18 UTC