Re: Opting in to cookies - proposal from Jonas Sicking on 2008-06-14 (public-webapps@w3.org from April to June 2008)

From: Jonas Sicking <jonas@sicking.cc>
Date: Sat, 14 Jun 2008 04:23:10 -0700
To: Maciej Stachowiak <mjs@apple.com>
Cc: public-webapps@w3.org
Message-ID: <4853AA1E.1040404@sicking.cc>
Maciej Stachowiak wrote:
> 
> 
> On Jun 13, 2008, at 10:45 PM, Jonas Sicking wrote:
> 
>> Maciej Stachowiak wrote:
>>> On Jun 13, 2008, at 5:20 PM, Jonas Sicking wrote:
>>>>
>>>> Maciej Stachowiak wrote:
>>>>> On Jun 13, 2008, at 4:56 PM, Jonas Sicking wrote:
>>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> Since I haven't received any feedback on the various straw-men in 
>>>>>> the "Opting in to cookies" thread, I'll send a full proposal 
>>>>>> (wrote most of this yesterday, Thomas wrote some opinions on 
>>>>>> cookies this morning).
>>>>>>
>>>>>> First off, as before, when I talk about "cookies" in this mail I
>>>>>> really mean cookies + digest auth headers + any other headers that
>>>>>> carry the user's credentials to a site. However, I'll just use the
>>>>>> term "cookies" for readability, and since cookies are currently the
>>>>>> most common carrier of credentials on the web.
>>>>>>
>>>>>> So here goes:
>>>>>>
>>>>>> When loading a resource using access-control, associate the request
>>>>>> with a "with credentials" flag.
>>>>>>
>>>>>> When the resource is loaded using a URI which starts with the string
>>>>>> "user-private:", set the "with credentials" flag to true. Otherwise,
>>>>>> set it to false.
>>>>> How could an http or https URI start with the string 
>>>>> "user-private:"? Are you proposing a new URI scheme?
>>>>
>>>> My proposal is for nesting schemes, so you'd load 
>>>> user-private:http://example.com/address.php
>>> That strikes me as distasteful. I would prefer to see a separate 
>>> includeCrossSiteCredentials parameter on XHR somewhere than to use 
>>> in-band signalling via the URL, if we feel this is needed.
>>
>> The advantage with putting it as part of the URI is that it is agnostic
>> to which method is used for loading. So for example inside XSLT you
>> could do:
>>
>> <xsl:for-each
>>  select="document('user-private:http://example.com/adr')/adr/*">
>>  Name <xsl:value-of select="@name"/>
>> </xsl:for-each>
>>
>> to incorporate private data. This is one of the design goals of
>> Access-Control, that it is a generic loading mechanism, rather than
>> limited to a particular API.
> 
> The downsides of inventing a URI scheme include:
> 
> 1) URIs using this scheme will not parse into components properly (the 
> feed: scheme has this problem)
> 2) The scheme really should be registered through IANA, which will be a 
> bureaucratic hassle
> 3) IANA would probably be hesitant, because user-private: does not 
> describe a new resource access method, it just describes what headers 
> you want to send, which in http is separate from the URI
> 4) It is in fact a valid point that this violates the design of URI schemes

All good points. I guess I'm more ok with it since we already deal with
this stuff in Gecko a good bit, as we already support nested schemes.

> 5) Code throughout the system will have to know to special-case this URI 
> scheme to treat it as equivalent to the corresponding HTTP URI

Not necessarily. You'd just need to strip the "user-private:" part in
the Access-Control implementation, which means that most of the system
would just see the HTTP URI.
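
A minimal sketch of that stripping step, in Python purely for
illustration. The function name split_user_private is hypothetical, and
matching the prefix as a literal string follows the wording of the
proposal rather than any existing implementation:

```python
from typing import Tuple

# Prefix named in the proposal; matched literally, as the proposal
# says "starts with the string".
USER_PRIVATE_PREFIX = "user-private:"


def split_user_private(uri: str) -> Tuple[str, bool]:
    """Return (uri_to_load, with_credentials) for a requested URI.

    Hypothetical helper: strips a leading "user-private:" so the rest
    of the system only ever sees the inner URI, and reports the
    "with credentials" flag alongside it.
    """
    if uri.startswith(USER_PRIVATE_PREFIX):
        return uri[len(USER_PRIVATE_PREFIX):], True
    return uri, False


if __name__ == "__main__":
    print(split_user_private("user-private:http://example.com/address.php"))
    print(split_user_private("http://example.com/address.php"))
```

Only the code that calls this helper would need to know about the
prefix; everything downstream would be handed a plain HTTP URI plus a
boolean.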

> Compared to that, I think the benefit of retrieving user-specific XSL 
> stylesheets with credentials seems pretty low.
>
> In general, even though Access-Control originated as a generic 
> mechanism, the XHR use case is clearly far more important to the Web 
> than any other and we shouldn't stuff out-of-band parameters in the URI 
> just for the sake of the other relatively minor use cases.

I do think there is a lot of value in a generic algorithm, though I
would be fine with doing the flag too. We can always consider a URI
scheme in a second version if there is demand for it.

>>> However, I am not sure I understand the purpose of extra client-side 
>>> opt-in to cookies. Presumably the client side is cross-site and 
>>> therefore not under control of the server. So if the server is 
>>> careless about cross-site requests that include cookies or other 
>>> credentials, then I do not see how an opt-in mechanism on the client 
>>> side helps at all.
>>
>> The opt-in ultimately comes from the server through the
>> Access-Control-With-Credentials header. However for GET requests we 
>> don't know if the server is going to opt in or not at the time when 
>> the request is made. Therefore the client must opt in first, however 
>> if the client opts in but the server doesn't, the response from the 
>> server is discarded.
> 
> So it's just to save a round trip for the cookies case for GET requests? 
> (A round trip the first time only since the method check result may be 
> cached?)

If we didn't opt in on the client side, we would in very many cases
effectively have to require an extra OPTIONS request, even for public
data. The reason is that the user will often have cookies set even on
sites that just deal with public data. For example, on craigslist I have
two cookies set even though any mashup involving craigslist would just
deal with public data.

> If you assume opting into cookies is common on the server side, then the 
> XHR implementation could send GETs with cookies but deny data and retry 
> the request unless the server gives an "I opted into cookies" header in 
> the response.

I considered that. However the result would likely be that sites would 
opt in to cookies just to lower the load on their servers, which would 
negate the usefulness of the cookie opt-in.

> If you assume the opposite is more likely then you can 
> send the GET without credentials first and retry if the server responds 
> that it would have wanted cookies (and you can cache this).

This is unlikely to be well received by the people who argued for the
Access-Control-Policy-Path feature, since it would double the number of
requests if you were interacting with a number of distinct URIs. We
could come up with some way of using Access-Control-Policy-Path directly
in GET requests, but that feature is complicated enough as it is, and I
would much prefer to keep GET requests simple, as that is likely going
to be by far the most common case.

> For the common case of XHR, we could also add a separate boolean 
> attribute specifying whether the client thinks the server wants cookies, 
> to suggest which of the above two guesses to take.
>
> That would work for all cases, with extra requests pretty rare, and
> would not require inventing a URI scheme.

If we do that, then I think we should just let that XHR boolean be the
"with credentials" flag in my originally proposed algorithm. I.e. just
do one request based on what the boolean says and fail if the server
didn't agree.
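
As a rough illustration of that single-request behaviour, here is a
sketch in Python. The function name credentials_check_passes is
hypothetical, and treating the mere presence of an
Access-Control-With-Credentials response header as the server's opt-in
is an assumption; the header's value syntax is not defined in this
message:

```python
from typing import Mapping

# Header name taken from this thread; how its value is interpreted
# is NOT specified here, so presence alone is assumed to mean opt-in.
OPT_IN_HEADER = "access-control-with-credentials"


def credentials_check_passes(with_credentials: bool,
                             response_headers: Mapping[str, str]) -> bool:
    """Decide whether the credentials part of the check passes.

    Hypothetical helper. If the client did not set the "with
    credentials" flag, no cookies were sent and there is nothing to
    check here (the ordinary Access-Control check still applies and is
    not modelled). If the client did set the flag, the response is
    only usable when the server opted in as well.
    """
    if not with_credentials:
        return True
    # HTTP header names are case-insensitive, so compare lowercased.
    return any(name.lower() == OPT_IN_HEADER for name in response_headers)


if __name__ == "__main__":
    print(credentials_check_passes(True, {"Access-Control-With-Credentials": "true"}))
    print(credentials_check_passes(True, {"Content-Type": "text/xml"}))
    print(credentials_check_passes(False, {}))
```

The point of the sketch is that there is no retry: one request is made
according to the flag, and a mismatch simply means the response is
discarded.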

> I must say though, this is starting to sound complex and I am not 
> totally convinced of the need to make servers opt in to getting cookies. 
> Is it really a likely mistake that someone would take affirmative steps 
> to enable cross-site access to a per-user resource, but then neglect to 
> check whether requests are cross-site and act appropriately?

I do think there is a big risk of that, yes. Many sites that today
mostly serve public data also have a few pages that contain forms or
otherwise serve user-specific data.

Even something as simple as a news site that largely serves public news 
articles might have a cookie where the user has chosen a home location 
for local news. This is the case on the news site I use for example, it 
usually just serves a standard list of news, but if I give it my home 
zip code, it will additionally serve a section of local news.

This is something that could very easily be overlooked by an
administrator who just configures his server to add an "Access-Control:
allow<*>" header using a site-wide configuration file, without going
through all CGI scripts on the server and teaching the ones that honor
cookies to ignore the cookies for cross-site requests.

> And if the 
> administrator of such a server thoughtlessly enabled cross-site access 
> without thinking about the consequences, would they not be equally 
> likely to enable cross-site cookies without thinking about the 
> consequences?

Not more likely than someone adding any other header without knowing 
what it does. This is why I designed my proposal such that opting in to 
cookies is a separate "step".

As described in my initial post here
http://lists.w3.org/Archives/Public/public-appformats/2008May/0144.html
a server operator who wants to publish public data has very few security
considerations to deal with if he can opt in without cookies.

When publishing private data there are always many more considerations
to take into account. You have to make sure to ask the user first
before sharing his/her data, and you have to make sure to only serve the
data that the user has opted in to (which likely means that you'd want
to set up separate URIs). We can only hope that people who decide to
share private data think before doing so.

> It seems like we are adding a lot of complexity (and therefore more 
> opportunity for implementation mistakes) for a marginal reduction in the 
> likelihood of server configuration errors.

I think the ability to separate sharing of private data from sharing of 
public data is a huge help for server operators. So I think this is much 
more than a marginal reduction of configuration errors.

I asked this in a separate thread already, but have you guys at Apple
had your security people look through the spec? Were they comfortable
both with always sending cookies and with always enabling all HTTP
methods and headers?

/ Jonas
Received on Saturday, 14 June 2008 11:24:35 UTC