- From: Mike West <mkwst@google.com>
- Date: Tue, 21 Jun 2016 14:40:25 +0200
- To: Jeff Hodges <jeff.hodges@kingsmountain.com>
- Cc: Mark Goodwin <mgoodwin@mozilla.com>, IETF HTTP WG <ietf-http-wg@w3.org>
- Message-ID: <CAKXHy=fVmSru1G3zUiDDWuKhpNLXT-x6e1PEOEWzkWgu-twaPg@mail.gmail.com>
More thanks for more comments. :)
On Sat, Jun 18, 2016 at 12:21 AM, <jeff.hodges@kingsmountain.com> wrote:
> * the term registrable domain should be registered domain throughout
> (details below)
>
https://publicsuffix.org/list/ uses both. Do they have distinct meanings?
>
> * RFC6265 does not wrap its algorithm variables in double quotes (as this
> draft is doing), and also hyphenates multi-word variable names (even ones
> that aren't ABNF rule names). Are you suggesting that 6265bis ought to
> adopt the style used in this draft (where alg varbs are quoted), or will
> the below proposed updates to 6265bis adopt RFC6265's present style?
>
> I advocate for the latter -- e.g., I suggest: s/"site for
> cookies"/site-for-cookies/g
>
I'm used to W3C specs, where we can actually mark up variables in a way
that gives them semantic meaning. I'm happy to align this document with
IETF style; if hyphenated names are preferred, I'll start adding hyphens.
> * RFC6265 uses the terms host, server, domain in its algorithms, whereas
> this spec introduces the terms site, same site, top-level site, "site for
> cookies". This may cause impedance mismatches when attempting to merge
> this draft into 6265bis. Perhaps just something to be aware of and address
> at that time.
>
Agreed. I suspect we'll want to move some of these definitions elsewhere
(HTML, Fetch, and/or URL, for instance).
> * when comparing host names, or portions thereof, they ought to first be
> canonicalized per RFC6265 S 5.1.2, yes?
>
Yes.
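For readers following along, here is a minimal Python sketch of that canonicalization step. It assumes Python's built-in "idna" codec (which implements the RFC 3490 ToASCII conversion) is an acceptable stand-in for the label conversion in RFC6265 S 5.1.2; the function name `canonicalize_host` and the final lowercasing are illustrative assumptions, not text from either document.

```python
def canonicalize_host(host):
    """Return an ASCII form of host suitable for comparison (sketch only)."""
    # The "idna" codec converts each non-ASCII label to its "xn--" form.
    ascii_host = host.encode("idna").decode("ascii")
    # Assumption: lowercase as well, so later comparisons ignore case.
    return ascii_host.lower()


if __name__ == "__main__":
    print(canonicalize_host("WWW.Example.COM"))
```

Both sides of any "exact match" comparison would be passed through the same function before being compared.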
> 1. Introduction
> >
> > Section 8.2 of [RFC6265] eloquently notes that cookies are a form of
>
> s/are a/may be employed as/
>
> > ambient authority, attached by default to requests the user agent
> > sends on a user's behalf. Even when an attacker doesn't know the
> > contents of a user's cookies, she can still execute commands on the
> > user's behalf (and with the user's authority) by asking the user
> > agent to send HTTP requests to unwary servers.
>
> I'd append to the end of latter sentence..
>
> which will include any previously-set cookies.
>
>
>
> > Here, we update [RFC6265] with a simple mitigation strategy that
> > allows servers to declare certain cookies as "same-site", meaning
> > they should not be attached to "cross-site" requests (as defined in
> > section 2.1).
>
> s/2.1)/2.1 of this specification)/
>
> it isn't immediately clear whether the above is referring to section 2.1 of
> this spec or RFC6265.
>
>
> > Note that the mechanism outlined here is backwards compatible with
> > the existing cookie syntax. Servers may serve these cookies to all
> > user agents; those that do not support the "SameSite" attribute will
>
> RFC6265 does not quote attribute names in prose, it would be written as
> SameSite attribute. this style is used in the below comments.
>
Hrm. It seems valuable to in some way distinguish between "code" and
"prose". If this were an HTML document, I'd wrap `SameSite` in a `<code>`
tag. Again, I'm not familiar with IETF style, so I'll defer to you on it,
but it seems an unfortunate choice.
>
>
> > simply store a cookie which is attached to all relevant requests,
> > just as they do today.
>
> suggested mod to latter sentence:
>
> simply store a cookie, which is subsequently attached to all relevant
> requests (as defined by [RFC6265]), just as they do today.
>
>
>
> > 1.1. Goals
>
> might these be items that should be added into 6265bis' various
> "considerations" sections (as appropriate)?
>
I think that's reasonable in the combined document, but in a stand-alone
document targeting a specific feature, it seems reasonable to be explicit
about that feature's raison d'etre right up front.
> >
> > These cookies are intended to provide a solid layer of defense-in-
>
> s/these/Same-site/
>
Sure.
> > depth against attacks which require embedding an authenticated
> > request into an attacker-controlled context:
> >
> > 1. Timing attacks which yield cross-origin information leakage (such
> > as those detailed in [pixel-perfect]) can be substantially
> > mitigated by setting the "SameSite" attribute on authentication
> > cookies. The attacker will only be able to embed unauthenticated
> > resources, as embedding mechanisms such as "<iframe>" will yield
> > cross-site requests.
> >
> > 2. Cross-site script inclusion (XSSI) attacks are likewise mitigated
> > by setting the "SameSite" attribute on authentication cookies.
> > The attacker will not be able to include authenticated resources
> > via "<script>" or "<link>", as these embedding mechanisms will
> > likewise yield cross-site requests.
>
> do you actually mean `<script src="..." />` rather than
> `<script>...</script>` here?
>
Yes, but I'm not sure it's valuable to make that distinction. I meant to
refer to "the <script> tag" here, as that's the mechanism by which the
request would be triggered.
> 1.2. Examples
> >
> > Same-site cookies are set via the "SameSite" attribute in the "Set-
>
> s/set/declared/ ?
>
The header is called `Set-Cookie`. Given that, "set" seems like the right
verb here.
> > Subsequent requests from that user agent can be expected to contain
> > the following header field if and only if both the requested resource
> > and the resource in the top-level browsing context match the cookie.
>
>
> missing the example header field that ostensibly should appear here?
>
Indeed, thanks!
> > 2. Terminology and notation
>
> is the intention that the terminology in this immediate section would be
> added to 6265bis' section 2.3 "Terminology" ?
>
Yup. Or elsewhere, as noted above. I've talked a bit with Anne about moving
some things from here into HTML/URL/Fetch, and I think he's amenable.
> >
> > The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
> > "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
> > document are to be interpreted as described in [RFC2119].
> >
> > This specification uses the Augmented Backus-Naur Form (ABNF)
> > notation of [RFC5234].
> >
> > Two sequences of octets are said to case-insensitively match each
> > other if and only if they are equivalent under the "i;ascii-casemap"
> > collation defined in [RFC4790].
> >
> > The terms "active document", "ancestor browsing context", "browsing
> > context", "document", "WorkerGlobalScope", "sandboxed origin browsing
> > context flag", "parent browsing context", "the worker's Documents",
> > "nested browsing context", and "top-level browsing context" are
> > defined in [HTML].
>
> add to the above list:
>
> Document, shared worker, dedicated worker
>
Document was already in the list (though uncapitalized), I've added the
other two. Thanks!
> > The term "public suffix" is defined in a note in Section 5.3 of
> > [RFC6265] as "a domain that is controlled by a public registry". For
> > example, "example.com"'s public suffix is "com". User agents SHOULD
> > use an up-to-date public suffix list, such as the one maintained by
> > Mozilla at [PSL].
>
> I would add -- "Public suffixes" are also known as "effective top-level
> domains" (eTLDs). -- since that latter term is also used in many places in
> the wild.
>
Sure.
> > An origin's "registrable domain" is the origin's host's public suffix
> > plus the label to its left. That is, "https://www.example.com"'s
> > registrable domain is "example.com". This concept is defined more
> > rigorously in [PSL].
>
> I suggest: s/registrable domain/registered domain/g
>
> I've inquired with <team@publicsuffix.org> regarding the "registrable
> domain" and "registered domain" terms and it seems that this spec should
> be using "registered domain".
>
> see also..
> https://github.com/publicsuffix/list/issues/236
> https://github.com/publicsuffix/publicsuffix.org/pull/2
>
> suggested rewrite of above parag..
>
> An origin's "registered domain" is the origin's host's public suffix
> plus the domain name label to its left. That is, for
> "https://www.example.com", the public suffix is "com" and the
> registered domain is "example.com". This concept is defined more
> rigorously in [PSL], and is also known as "effective top-level
> domain plus one (eTLD+1)".
>
Ok, thanks for this answer to the question I asked at the top. :)
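To make the eTLD+1 definition concrete, here is a minimal Python sketch. The `PUBLIC_SUFFIXES` set is a tiny hypothetical stand-in for the real list at publicsuffix.org (which is far larger and also has wildcard and exception rules), and `registered_domain` is an illustrative name rather than anything defined by the draft or by [PSL].

```python
# Hypothetical, heavily abbreviated stand-in for the public suffix list.
PUBLIC_SUFFIXES = {"com", "org", "co.uk"}


def registered_domain(host):
    """Return the public suffix plus the label to its left, or None."""
    labels = host.lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest, so that
    # "co.uk" is preferred over a shorter match.
    for i in range(1, len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in PUBLIC_SUFFIXES:
            return ".".join(labels[i - 1:])
    # The host is itself a public suffix, or no known suffix matched.
    return None


if __name__ == "__main__":
    print(registered_domain("www.example.com"))
```

A host that is itself a public suffix (for example "com") has no label to its left, so the sketch returns None for it.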
> [ aside fwiw: I don't think the algorithm or terminology used on
> <https://publicsuffix.org/list/> is terribly rigorous and it could use
> some improvement ]
>
Probably worthwhile, yes.
> > The term "request", as well as a request's "client", "current url",
> > "method", and "target browsing context", are defined in [FETCH].
>
> suggest adding request's "initiator" to the list above since it is used in
> opening parag of next section
>
Actually, I need to rename the thing I want to talk about here. Fetch
defines an "initiator" which isn't what I want. What I want is the origin
of the request's client.
> Are the below definitions & algorithms in S 2.1 et al slated to be inserted
> into 6265bis, e.g., in 6265bis S 5 "User Agent Requirements"?
>
> Also, before diving into same-/cross-site, document-based, worker-based
> requests and all, I suggest adding a section here defining the overall
> concept of site-for-cookies...
>
>
> 2.1 The Site-for-Cookies Concept
>
> A request is the input to the HTML Fetch algorithm [FETCH].
> Initiators of requests have an associated origin, which may
> contain a host name. If so, then the host name contains a
> registered domain, which is the top-most domain on which
> the initiator server-side may set its cookies. Thus this
> registered domain is the initiator's site-for-cookies,
> which is used in determining whether a request is same-site
> or not.
>
>
>
> > 2.1. "Same-site" and "cross-site" Requests
> >
> > A request is "same-site" if its target's URI's origin's registrable
> > domain is an exact match for the request's initiator's "site for
> > cookies", and "cross-site" otherwise. To be more precise, for a
> > given request ("request"), the following algorithm returns "same-
> > site" or "cross-site":
> >
> > 1. If "request"'s client is "null", return "same-site".
> >
> > 2. Let "site" be "request"'s client's "site for cookies" (as defined
> > in the following sections).
> >
> > 3. Let "target" be the registrable domain of "request"'s current
> > url.
> >
> > 4. If "site" is an exact match for "target", return "same-site".
> >
> > 5. Return "cross-site".
>
>
> ISTM there's various issues with the above, below is a suggested overall
> revision.
>
> also, I wonder whether it is a good idea to have this particular
> definition to be characterized as an algorithm if they are not themselves
> actually incorporated within the normative 6265bis cookie-processing
> algorithms. Thus the below is not characterized as an algorithm...
>
Thank you for this thoughtful reformulation. I think I'll take a stab at
moving these definitions out of this document first, and will try to
incorporate your feedback into those patches to the external documents on
which this relies.
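For reference while those definitions move, the five-step same-site check quoted above from the draft's section 2.1 can be sketched in Python roughly as follows. The `Request` and `Client` classes and their field names are hypothetical stand-ins for the Fetch concepts, and `naive_registrable_domain` simply takes the last two labels, which is an assumption made only to keep the example self-contained; a real user agent would consult the public suffix list.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class Client:
    site_for_cookies: str  # hypothetical field: the client's "site for cookies"


@dataclass
class Request:
    client: Optional[Client]
    current_url: str


def naive_registrable_domain(host):
    # Illustration only: treats the last label as the public suffix.
    return ".".join(host.lower().split(".")[-2:])


def classify(request):
    # Step 1: a request with no client is treated as same-site.
    if request.client is None:
        return "same-site"
    # Step 2: the client's "site for cookies".
    site = request.client.site_for_cookies
    # Step 3: the registrable domain of the request's current url.
    target = naive_registrable_domain(urlsplit(request.current_url).hostname or "")
    # Steps 4 and 5: exact match means same-site, anything else cross-site.
    return "same-site" if site == target else "cross-site"


if __name__ == "__main__":
    req = Request(Client("example.com"), "https://www.example.com/sekrit-image")
    print(classify(req))
```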
> 3. Server Requirements
> >
> > This section describes extensions to [RFC6265] necessary to implement
> ^
> section 4
>
> > the server-side requirements of the "SameSite" attribute.
> >
> > 3.1. Grammar
> >
> > Add "SameSite" to the list of accepted attributes in the "Set-Cookie"
> > header field's value by replacing the "cookie-av" token definition in
> > Section 4.1.1 of [RFC6265] with the following ABNF grammar:
> >
> > cookie-av = expires-av / max-age-av / domain-av /
> > path-av / secure-av / httponly-av /
> > samesite-av / extension-av
> > samesite-av = "SameSite" / "SameSite=" samesite-value
> ^^^^^^^
> this allows for a SameSite attr without a value?
> is that actually used somewhere in the below algorithms?
>
No, this is https://github.com/mikewest/internetdrafts/issues/11 :), which
I'll fix now that I finally got around to uploading a -00 draft with the
contents of the existing -07.
>
> > samesite-value = "Strict" / "Lax"
> >
> >
> >
> > 3.2. Semantics of the "SameSite" Attribute (Non-Normative)
> >
> > The "SameSite" attribute limits the scope of the cookie such that it
> > will only be attached to requests if those requests are "same-site",
> > as defined by the algorithm in Section 2.1.
> ^^^^^^^^^^^^^^^^
> delete
> > For example, requests
> > for "https://example.com/sekrit-image" will attach same-site cookies
> > if and only if initiated from a context whose "site for cookies" is
> > "example.com".
> >
> > If the "SameSite" attribute's value is "Strict", or if the value is
> > invalid, the cookie will only be sent along with "same-site"
> > requests.
>
> suggest s/"same-site" request/same-site request/g
>
> > If the value is "Lax", the cookie will be sent with "same-
> > site" requests, and with "cross-site" top-level navigations, as
> > described in Section 4.1.1.
>
> what if the SameSite attribute has no attribute value as allowed by the
> ABNF above?
>
>
> > The changes to the "Cookie" header field suggested in Section 4.3
> > provide additional detail.
> >
> > 4. User Agent Requirements
> >
> > This section describes extensions to [RFC6265] necessary in order to
> ^
> Section 5
>
>
> > implement the client-side requirements of the "SameSite" attribute.
> >
> > 4.1. The "SameSite" attribute
> >
> > The following attribute definition should be considered part of the
> > "Set-Cookie" algorithm as described in Section 5.2 of [RFC6265]:
> > If the "attribute-name" case-insensitively matches the string
> > "SameSite", the user agent MUST process the "cookie-av" as follows:
> >
> > 1. If "cookie-av"'s "attribute-value" is not a case-sensitive match
> > for "Strict" or "Lax", ignore the "cookie-av".
>
> should the cookie-av's attribute-value match here be case-insensitive since
> the match in the following rule is case-insensitive?
>
Yes, these should both be case-insensitive.
> also, what if attribute-value is empty because SameSite had no value ?
> though I suppose the above rule addresses that i.e. it would ignore the
> cookie-av ?
>
That is the intent, yes.
> > 2. Let "enforcement" be "Lax" if "cookie-av"'s "attribute-value" is
> > a case-insensitive match for "Lax", and "Strict" otherwise.
> >
> > 3. Append an attribute to the "cookie-attribute-list" with an
> > "attribute-name" of "SameSite" and an "attribute-value" of
> > "enforcement".
>
> oh, so enforcement here is just a local algorithm variable? and step 3 is
> actually saying to use the value of enforcement as the attr-value for
> SameSite ? if so, perhaps ought to be clarified..
>
I understand sections 5.2.* of RFC6265 to be accepting an attribute-name
and attribute-value string (see step 6 of that document's 5.2). The
attribute that's appended to 'cookie-attribute-list' has its own attribute
name and value.
I agree that this is a little confusing, and that we could make the
expected inputs/outputs clearer when we revise the document.
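As a sketch of the inputs and outputs under discussion, the three steps might look like this in Python, with both comparisons made case-insensitive as agreed above. The function name `process_samesite` and the use of (name, value) tuples for "cookie-attribute-list" entries are assumptions for illustration.

```python
def process_samesite(attribute_value, cookie_attribute_list):
    """Handle a cookie-av whose attribute-name matched "SameSite"."""
    value = attribute_value.lower()
    # Step 1: ignore the cookie-av unless the value is "Strict" or "Lax"
    # (this also covers an empty or missing value).
    if value not in ("strict", "lax"):
        return
    # Step 2: "enforcement" is a local variable holding the normalized value.
    enforcement = "Lax" if value == "lax" else "Strict"
    # Step 3: the appended attribute carries its own name and value.
    cookie_attribute_list.append(("SameSite", enforcement))


if __name__ == "__main__":
    attrs = []
    process_samesite("lax", attrs)
    print(attrs)
```

The list is mutated in place, so a caller can tell that the cookie-av was ignored by the absence of a "SameSite" entry afterwards.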
> > 4.1.1. "Strict" and "Lax" enforcement
> >
> > By default, same-site cookies will not be sent along with top-level
> > navigations. As discussed in Section 5.2, this might or might not be
> > compatible with existing session management systems. In the
> > interests of providing a drop-in mechanism that mitigates the risk of
> > CSRF attacks, developers may set the "SameSite" attribute in a "Lax"
> > enforcement mode that carves out an exception which sends same-site
> > cookies along with cross-site requests if and only if they are top-
> > level navigations which use a "safe" (in the [RFC7231] sense) HTTP
> > method.
> >
> > Lax enforcement provides reasonable defense in depth against CSRF
> > attacks that rely on unsafe HTTP methods (like "POST"), but do not
>
> s/do not/does not/ ?
>
Indeed!
> > offer a robust defense against CSRF as a general category of attack:
> >
> > 1. Attackers can still pop up new windows or trigger top-level
> > navigations in order to create a "same-site" request (as
> > described in section 2.1), which is only a speedbump along the
> > road to exploitation.
> >
> > 2. Features like "<link rel='prerender'>" [prerendering] can be
> > exploited to create "same-site" requests without the risk of user
> > detection.
> >
> > When possible, developers should use a session management mechanism
> > such as that described in Section 5.2 to mitigate the risk of CSRF
> > more completely.
>
> "Strict" enforcement is not explicitly defined?
>
> hm, i guess it is defined in S 3.2 -- perhaps add a x-ref to that?
>
Clarified this in the first sentence (followed by a link to 5.2).
> 4.2. Monkey-patching the Storage Model
> >
> > Note: There's got to be a better way to specify this. Until I figure
> > out what that is, monkey-patching!
> >
> > Alter Section 5.3 of [RFC6265] as follows:
> >
> > 1. Add "samesite-flag" to the list of fields stored for each cookie.
>
> s/Add/Add (in the first paragraph)/ -- the list of fields is in the first
> parag of S 5.3
>
Poked at this.
> > This field's value is one of "None", "Strict", or "Lax".
>
> i'd prefix the above sentence with 'Note: '
>
Sure. I think we could be a bit clearer with 5.3's definition entirely if
we gave a normative description of possible values/types for each of the
flags, but as-is, a note is reasonable.
> > 2. Before step 11 of the current algorithm, add the following:
>
> the below are to be new top-level algorithm steps?
>
These would be new steps 11 and 12. The current step 11 would be bumped
down to 13, and subsequent steps would follow along as 14, 15, etc.
> > 1. If the "cookie-attribute-list" contains an attribute with an
> > "attribute-name" of "SameSite", set the cookie's "samesite-
> > flag" to "attribute-value" ("Strict" or "Lax"). Otherwise,
> > set the cookie's "samesite-flag" to "None".
> >
> > 2. If the cookie's "samesite-flag" is not "None", and the
> > request which generated the cookie's client's "site for
> > cookies"
>
> the term "cookie's client('s)" is used only here (and also isn't defined in
> RFC6265) -- what is it's definition?
>
It's actually referring to "request's client" from FETCH, i.e.
"request-which-generated-the-cookie's client". This would, I'm sure, be
simpler in German. :)
> > is not an exact match for "request-uri"'s host's
> > registrable domain, then abort these steps and ignore the
> > newly created cookie entirely.
>
> so the above step applies whether the enforcement policy is lax or strict?
>
Correct. The intent is to store cookies with the SameSite attribute only
when set in a same-site context.
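A minimal Python sketch of the two proposed storage steps follows. It assumes the "cookie-attribute-list" is a list of (name, value) tuples and that the caller has already computed the client's "site for cookies" and the registrable domain of the request-uri's host; `storage_decision` and its parameter names are hypothetical. Where the list holds more than one "SameSite" entry, the sketch lets the last one win, which is also an assumption.

```python
def storage_decision(cookie_attribute_list, client_site_for_cookies,
                     request_registrable_domain):
    """Return (samesite_flag, store) for a newly parsed cookie."""
    # First new step: derive the samesite-flag from the parsed attributes.
    samesite_flag = "None"
    for name, value in cookie_attribute_list:
        if name == "SameSite":
            samesite_flag = value  # "Strict" or "Lax"
    # Second new step: ignore SameSite cookies set from a cross-site
    # context, whichever enforcement mode was requested.
    if (samesite_flag != "None"
            and client_site_for_cookies != request_registrable_domain):
        return samesite_flag, False
    return samesite_flag, True


if __name__ == "__main__":
    print(storage_decision([("SameSite", "Lax")], "example.com", "example.com"))
```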
> > 4.3. Monkey-patching the "Cookie" header
> >
> > Note: There's got to be a better way to specify this. Until I figure
> > out what that is, monkey-patching!
> >
> > Alter Section 5.4 of [RFC6265] as follows:
> >
> > 1. Add the following requirement to the list in step 1:
>
> to the end of the existing bullet list in [RFC6265] ?
>
Yes. I've clarified this.
> > * If the cookie's "samesite-flag" is not "None", and the HTTP
> > request is cross-site (as defined in Section 2.1 then exclude
> ^
> )
>
Done.
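The quoted bullet is cut off in this thread, so the following Python sketch is only one reading of it, combined with the "Lax" carve-out quoted earlier from section 4.1.1 (cross-site top-level navigations using a safe method). The function name and boolean parameters are hypothetical; the safe-method set is the one RFC7231 defines (GET, HEAD, OPTIONS, TRACE).

```python
# Methods that RFC7231 defines as safe.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def should_attach(samesite_flag, is_cross_site, is_top_level_navigation, method):
    """Decide whether a stored cookie goes into the Cookie header."""
    # Cookies without the attribute, and all same-site requests, behave
    # as they do under RFC6265 today.
    if samesite_flag == "None" or not is_cross_site:
        return True
    # "Lax": cross-site is allowed only for safe top-level navigations.
    if samesite_flag == "Lax":
        return is_top_level_navigation and method in SAFE_METHODS
    # "Strict": never attached to cross-site requests.
    return False


if __name__ == "__main__":
    print(should_attach("Lax", True, True, "GET"))
```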
> > 5. Authoring Considerations
>
> this section is intended to be added to 6265bis?
>
That's up to the group. I think it makes sense to describe how we intend
for developers to actually use this thing. It certainly makes sense in the
context of this document, and I think it probably also makes sense as a new
section in 6265bis that describes authoring considerations with this and
other attributes.
> > Developers can avoid this confusion by adopting a session management
> > system that relies on not one, but two cookies: one conceptualy
>
> s/conceptualy/conceptually/
>
Thanks!
> > granting "read" access, another granting "write" access. The latter
> > could be marked as "SameSite", and its absence would provide a
>
> s/provide/prompt/ ?
>
Yes.
>
> > reauthentication step before executing any non-idempotent action.
> > The former could drop the "SameSite" attribute entirely, or choose
> > the "Lax" version of enforcement, in order to allow users access to
> > data via top-level navigation.
>
> so the above is based on the sites using cookie-based ambient authn (yes?),
> which is a reasonable assumption I suppose but perhaps should be stated
> explicitly?
>
I added "cookie-based" to the first paragraph of the section, which I hope
helps clarify.
> 5.3. Mashups and Widgets
> >
> > The "SameSite" attribute is inappropriate for some important use-
> > cases. In particular, note that content intended for embedding in a
> > cross-site contexts (social networking widgets or commenting
> > services, for instance) will not have access to such cookies.
>
> by "such cookies" you mean same-site cookies ?
>
Clarified.
>
> > Cross-
> > site cookies may be required in order to provide seamless
> > functionality that relies on a user's state.
>
> by "cross-site cookies" you mean cookies lacking the SameSite attribute?
> this is the only place this term is used at this time...
>
Reworded.
> > Likewise, some forms of Single-Sign-On might require authentication
>
> by "authentication" here you mean ambient cookie-based authn ?
>
Yes. Made that explicit.
> > in a cross-site context; these mechanisms will not function as
> > intended with same-site cookies.
>
> and thus will need to (continue) to use so-called cross-site cookies ?
>
Yes, but that seems clear from context.
> > 6. Privacy Considerations
>
> the below are to be added to 6265bis ?
Yes.
> >
> > 6.1. Server-controlled
> >
> > Same-site cookies in and of themselves don't do anything to address
> > the general privacy concerns outlined in Section 7.1 of [RFC6265].
> > The attribute is set by the server, and serves to mitigate the risk
> ^
> SameSite ?
>
> > of certain kinds of attacks that the server is worried about. The
> > user is not involved in this decision. Moreover, a number of side-
> > channels exist which could allow a server to link distinct requests
> > even in the absence of cookies.
>
> you mean to say "..could allow a third-party server to link distinct
> top-level navigation requests to first-party servers, even in the absence
> of cookies." ?
>
"top-level navigation" is too limiting. Servers receive channel ID/token
binding data for any request, for instance, not just navigational requests.
> > Connection and/or socket pooling,
> > Token Binding, and Channel ID all offer explicit methods of
> > identification that servers could take advantage of.
>
> I'm not sure the above sentence is applicable to the third-party cookie
> discussion in RFC6265 S 7.1. My understanding is that Token Binding and
> ChannelID present server-specific identifiers and cannot be used by a
> third-party server to track the UA's top-level navigation requests to
> various first-party servers.
>
Both of these mechanisms send origin-specific data to a server as part of a
request. They have similar properties to cookies, in that they can be used
to correlate requests in a top-level or nested or subresource context.
They're not only sent along with top-level requests, and their privacy
concerns aren't limited to those top-level contexts (e.g. `google.com` in a
frame gets a channel ID as part of the TLS handshake. The same ID, in fact,
that it receives in a top-level context).
> > 6.2. Pervasive Monitoring
> >
> > As outlined in [RFC7258], pervasive monitoring is an attack. Cookies
> > play a large part in enabling such monitoring, as they are
> > responsible for maintaining state in HTTP connections. We considered
> > restricting same-site cookies to secure contexts [secure-contexts] as
> > a mitigation but decided against doing so, as this feature should
>
> by "this feature" you mean same-site cookies ?
>
Correct.
>
> > result in a strict reduction in the number of cookies floating around
> > in cross-site contexts. That is, even if "http://not-example.com"
> > embeds a resource from "http://example.com/", that resource will not
> > be "same-site", and "http://example.com"'s cookies simply cannot be
> > used to correlate user behavior across distinct origins.
> >
> > 7. References
>
>
> <snip/>
>
> > [app-isolation]
> > Chen, E., Bau, J., Reis, C., Barth, A., and C. Jackson,
> > "App Isolation - Get the Security of Multiple Browsers
> > with Just One", n.d.,
> > <http://www.collinjackson.com/research/papers/
> > appisolation.pdf>.
>
> the date of this paper appears to be 2011..
>
> CCS’11, October 17–21, 2011, Chicago, Illinois, USA.
> Copyright 2011 ACM 978-1-4503-0948-6/11/10
>
>
> end
>
>
>
Thank you, this was super-helpful. :)
I've made a number of changes in
https://github.com/httpwg/http-extensions/commit/263db30abb6f0f0b5a0cc3c2c3424108b3c171bc,
which I hope addresses the above. I'll make more changes to the various
algorithms in patches to HTML, Fetch, URL, et al.
-mike
Received on Tuesday, 21 June 2016 12:41:20 UTC