Re: comments on draft-west-first-party-cookies-07

More thanks for more comments. :)

On Sat, Jun 18, 2016 at 12:21 AM, <jeff.hodges@kingsmountain.com> wrote:

> * the term registrable domain should be registered domain throughout
> (details
> below)
>

https://publicsuffix.org/list/ uses both. Do they have distinct meanings?


>
> * RFC6265 does not wrap its algorithm variables in double quotes (as this
> draft is doing), and also hyphenates multi-word variable names (even ones
> that
> aren't ABNF rule names). Are you suggesting that  6265bis ought to adopt
> the
> style used in this draft (where alg varbs are quoted), or will the below
> proposed updates to 6265bis adopt RFC6265's present style?
>
> I advocate for the latter -- e.g., I suggest: s/"site for
> cookies"/site-for-cookies/g
>

I'm used to W3C specs, where we can actually mark up variables in a way
that gives them semantic meaning. I'm happy to align this document with
IETF style; if hyphenated names are preferred, I'll start adding hyphens.


> * RFC6265 uses the terms host, server, domain in its algorithms, whereas
> this
> spec introduces the terms site, same site, top-level site, "site for
> cookies".
>  This may cause impedance mismatches when attempting to merge this draft
> into
> 6265bis. Perhaps just something to be aware of and address at that time.
>

Agreed. I suspect we'll want to move some of these definitions elsewhere
(HTML, Fetch, and/or URL, for instance).


> * when comparing host names, or portions thereof, they ought to first be
> canonicalized per RFC6265 S 5.1.2, yes?
>

Yes.

> 1.  Introduction
> >
> >    Section 8.2 of [RFC6265] eloquently notes that cookies are a form of
>
> s/are a/may be employed as/
>
> >    ambient authority, attached by default to requests the user agent
> >    sends on a user's behalf.  Even when an attacker doesn't know the
> >    contents of a user's cookies, she can still execute commands on the
> >    user's behalf (and with the user's authority) by asking the user
> >    agent to send HTTP requests to unwary servers.
>
> I'd append to the end of latter sentence..
>
>   which will include any previously-set cookies.
>
>
>
> >    Here, we update [RFC6265] with a simple mitigation strategy that
> >    allows servers to declare certain cookies as "same-site", meaning
> >    they should not be attached to "cross-site" requests (as defined in
> >    section 2.1).
>
> s/2.1)/2.1 of this specification)/
>
> it isn't immediately clear whether the above is referring to section 2.1 of
> this spec or RFC6265.
>
>
> >    Note that the mechanism outlined here is backwards compatible with
> >    the existing cookie syntax.  Servers may serve these cookies to all
> >    user agents; those that do not support the "SameSite" attribute will
>
> RFC6265 does not quote attribute names in prose, it would be written as
> SameSite attribute. this style is used in the below comments.
>

Hrm. It seems valuable to distinguish in some way between "code" and
"prose". If this were an HTML document, I'd wrap `SameSite` in a `<code>`
tag. Again, I'm not familiar with IETF style, so I'll defer to you on it,
but it seems an unfortunate choice.


>
>
> >    simply store a cookie which is attached to all relevant requests,
> >    just as they do today.
>
> suggested mod to latter sentence:
>
>       simply store a cookie, which is subsequently attached to all relevant
> requests (as defined by [RFC6265]), just as they do today.
>
>
>
> > 1.1.  Goals
>
> might these be items that should be added into 6265bis' various
> "considerations" sections (as appropriate)?
>

I think that's reasonable in the combined document, but in a stand-alone
document targeting a specific feature, it seems reasonable to be explicit
about that feature's raison d'etre right up front.


> >
> >    These cookies are intended to provide a solid layer of defense-in-
>
> s/these/Same-site/
>

Sure.


> >    depth against attacks which require embedding an authenticated
> >    request into an attacker-controlled context:
> >
> >    1.  Timing attacks which yield cross-origin information leakage (such
> >        as those detailed in [pixel-perfect]) can be substantially
> >        mitigated by setting the "SameSite" attribute on authentication
> >        cookies.  The attacker will only be able to embed unauthenticated
> >        resources, as embedding mechanisms such as "<iframe>" will yield
> >        cross-site requests.
> >
> >    2.  Cross-site script inclusion (XSSI) attacks are likewise mitigated
> >        by setting the "SameSite" attribute on authentication cookies.
> >        The attacker will not be able to include authenticated resources
> >        via "<script>" or "<link>", as these embedding mechanisms will
> >        likewise yield cross-site requests.
>
> do you actually mean `<script src="..." />` rather than
> `<script>...</script>`
> here?
>

Yes, but I'm not sure it's valuable to make that distinction. I meant to
refer to "the <script> tag" here, as that's the mechanism by which the
request would be triggered.

> 1.2.  Examples
> >
> >    Same-site cookies are set via the "SameSite" attribute in the "Set-
>
> s/set/declared/ ?
>

The header is called `Set-Cookie`. Given that, "set" seems like the right
verb here.


> >    Subsequent requests from that user agent can be expected to contain
> >    the following header field if and only if both the requested resource
> >    and the resource in the top-level browsing context match the cookie.
>
>
> missing the example header field that ostensibly should appear here?
>

Indeed, thanks!


> > 2.  Terminology and notation
>
> is the intention that the terminology in this immediate section would be
> added
> to 6265bis' section 2.3 "Terminology" ?
>

Yup. Or elsewhere, as noted above. I've talked a bit with Anne about moving
some things from here into HTML/URL/Fetch, and I think he's amenable.


> >
> >    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
> >    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
> >    document are to be interpreted as described in [RFC2119].
> >
> >    This specification uses the Augmented Backus-Naur Form (ABNF)
> >    notation of [RFC5234].
> >
> >    Two sequences of octets are said to case-insensitively match each
> >    other if and only if they are equivalent under the "i;ascii-casemap"
> >    collation defined in [RFC4790].
> >
> >    The terms "active document", "ancestor browsing context", "browsing
> >    context", "document", "WorkerGlobalScope", "sandboxed origin browsing
> >    context flag", "parent browsing context", "the worker's Documents",
> >    "nested browsing context", and "top-level browsing context" are
> >    defined in [HTML].
>
> add to the above list:
>
>               Document, shared worker, dedicated worker
>

Document was already in the list (though uncapitalized); I've added the
other two. Thanks!


> >    The term "public suffix" is defined in a note in Section 5.3 of
> >    [RFC6265] as "a domain that is controlled by a public registry".  For
> >    example, "example.com"'s public suffix is "com".  User agents SHOULD
> >    use an up-to-date public suffix list, such as the one maintained by
> >    Mozilla at [PSL].
>
> I would add -- "Public suffixes" are also known as "effective top-level
> domains (eTLDs). -- since that latter term is also used in many places in
> the
> wild.
>

Sure.


> >    An origin's "registrable domain" is the origin's host's public suffix
> >    plus the label to its left.  That is, "https://www.example.com"'s
> >    registrable domain is "example.com".  This concept is defined more
> >    rigorously in [PSL].
>
> I suggest:  s/registrable domain/registered domain/g
>
> I've inquired with  <team@publicsuffix.org> regarding the "registrable
> domain"
> and "registered domain" terms and it seems that this spec should be using
> "registered domain".
>
>   see also..
>   https://github.com/publicsuffix/list/issues/236
>   https://github.com/publicsuffix/publicsuffix.org/pull/2
>
> suggested rewrite of above parag..
>
>    An origin's "registered domain" is the origin's host's public suffix
>    plus the domain name label to its left.  That is, for
>    "https://www.example.com", the public suffix is "com" and the
>    registered domain is "example.com".  This concept is defined more
>    rigorously in [PSL], and is also known as "effective top-level
>    domain plus one (eTLD+1)".
>

Ok, thanks for this answer to the question I asked at the top. :)


> [ aside fwiw: I don't think the algorithm or terminology used on
> <https://publicsuffix.org/list/> is terribly rigorous and it could use
> some
> improvement ]
>

Probably worthwhile, yes.


> >    The term "request", as well as a request's "client", "current url",
> >    "method", and "target browsing context", are defined in [FETCH].
>
> suggest adding request's "initiator" to the list above since it is used in
> opening parag of next section
>

Actually, I need to rename the thing I want to talk about here. Fetch
defines an "initiator" which isn't what I want. What I want is the origin
of the request's client.

> Are the below definitions & algorithms in S 2.1 et al slated to be inserted
> into 6265bis, e.g., in 6265bis S 5 "User Agent Requirements"?
>
> Also, before diving into same-/cross-site, document-based, worker-based
> requests and all, I suggest adding a section here defining the overall
> concept
> of site-for-cookies...
>
>
> 2.1 The Site-for-Cookies Concept
>
>    A request is the input to the HTML Fetch algorithm [FETCH].
>    Initiators of requests have an associated origin, which may
>    contain a host name. If so, then the host name contains a
>    registered domain, which is the top-most domain on which
>    the initiator server-side may set its cookies. Thus this
>    registered domain is the initiator's site-for-cookies,
>    which is used in determining whether a request is same-site
>    or not.
>
>
>
> > 2.1.  "Same-site" and "cross-site" Requests
> >
> >    A request is "same-site" if its target's URI's origin's registrable
> >    domain is an exact match for the request's initiator's "site for
> >    cookies", and "cross-site" otherwise.  To be more precise, for a
> >    given request ("request"), the following algorithm returns "same-
> >    site" or "cross-site":
> >
> >    1.  If "request"'s client is "null", return "same-site".
> >
> >    2.  Let "site" be "request"'s client's "site for cookies" (as defined
> >        in the following sections).
> >
> >    3.  Let "target" be the registrable domain of "request"'s current
> >        url.
> >
> >    4.  If "site" is an exact match for "target", return "same-site".
> >
> >    5.  Return "cross-site".
>
>
> ISTM there's various issues with the above, below is a suggested overall
> revision.
>
> also, I wonder whether it is a good idea to have this particular
> definition to
> be characterized as an algorithm if they are not themselves actually
> incorporated within the normative 6265bis cookie-processing algorithms.
> Thus
> the below is not characterized as an algorithm...
>

Thank you for this thoughtful reformulation. I think I'll take a stab at
moving these around out of this document first, and will try to incorporate
your feedback into those patches to the external documents on which this
relies.
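
For readers following along, the five-step check quoted above can be
sketched roughly as below. This is only an illustration, not text from the
draft: `PUBLIC_SUFFIXES` is a tiny hypothetical stand-in for the real list,
`registrable_domain` and `classify_request` are made-up names, a null
client is modelled as a `None` "site for cookies", and host
canonicalization is reduced to lowercasing.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the Public Suffix List; a real user agent would
# consult an up-to-date copy of the list instead.
PUBLIC_SUFFIXES = {"com", "org", "co.uk"}


def registrable_domain(host):
    """Return the public suffix plus one label, or None if there is none."""
    labels = host.lower().rstrip(".").split(".")
    # Try the longest candidate suffix first so "co.uk" wins over "uk".
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            return ".".join(labels[i - 1:])
    return None


def classify_request(client_site_for_cookies, current_url):
    """Mirror steps 1-5: return "same-site" or "cross-site"."""
    if client_site_for_cookies is None:      # step 1: client is null
        return "same-site"
    site = client_site_for_cookies           # step 2
    host = urlsplit(current_url).hostname or ""
    target = registrable_domain(host)        # step 3
    if site == target:                       # step 4: exact match
        return "same-site"
    return "cross-site"                      # step 5
```

With these assumptions, a request for "https://www.example.com/sekrit-image"
from a context whose "site for cookies" is "example.com" is classified as
same-site, and the same request from "not-example.com" is cross-site.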

> 3.  Server Requirements
> >
> >    This section describes extensions to [RFC6265] necessary to implement
>                                                    ^
>                                                section 4
>
> >    the server-side requirements of the "SameSite" attribute.
> >
> > 3.1.  Grammar
> >
> >    Add "SameSite" to the list of accepted attributes in the "Set-Cookie"
> >    header field's value by replacing the "cookie-av" token definition in
> >    Section 4.1.1 of [RFC6265] with the following ABNF grammar:
> >
> >    cookie-av      = expires-av / max-age-av / domain-av /
> >                     path-av / secure-av / httponly-av /
> >                     samesite-av / extension-av
> >    samesite-av    = "SameSite" / "SameSite=" samesite-value
>                         ^^^^^^^
>       this allows for a SameSite attr without a value?
>       is that actually used somewhere in the below algorithms?
>

No, this is https://github.com/mikewest/internetdrafts/issues/11, which
I'll fix now that I finally got around to uploading a -00 draft with the
contents of the existing -07.


>
> >    samesite-value = "Strict" / "Lax"
> >
> >
> >
> > 3.2.  Semantics of the "SameSite" Attribute (Non-Normative)
> >
> >    The "SameSite" attribute limits the scope of the cookie such that it
> >    will only be attached to requests if those requests are "same-site",
> >    as defined by the algorithm in Section 2.1.
>                 ^^^^^^^^^^^^^^^^
>                       delete
> >                                                   For example, requests
> >    for "https://example.com/sekrit-image" will attach same-site cookies
> >    if and only if initiated from a context whose "site for cookies" is
> >    "example.com".
> >
> >    If the "SameSite" attribute's value is "Strict", or if the value is
> >    invalid, the cookie will only be sent along with "same-site"
> >    requests.
>
> suggest s/"same-site" request/same-site request/g
>
> >             If the value is "Lax", the cookie will be sent with "same-
> >    site" requests, and with "cross-site" top-level navigations, as
> >    described in Section 4.1.1.
>
> what if the SameSite attribute has no attribute value as allowed by the
> ABNF
> above?
>
>
> >    The changes to the "Cookie" header field suggested in Section 4.3
> >    provide additional detail.
> >
> > 4.  User Agent Requirements
> >
> >    This section describes extensions to [RFC6265] necessary in order to
>                                                    ^
>                                                Section 5
>
>
> >    implement the client-side requirements of the "SameSite" attribute.
> >
> > 4.1.  The "SameSite" attribute
> >
> >    The following attribute definition should be considered part of the
> >    the "Set-Cookie" algorithm as described in Section 5.2 of [RFC6265]:
> >    If the "attribute-name" case-insensitively matches the string
> >    "SameSite", the user agent MUST process the "cookie-av" as follows:
> >
> >    1.  If "cookie-av"'s "attribute-value" is not a case-sensitive match
> >        for "Strict" or "Lax", ignore the "cookie-av".
>
> should the cookie-av's attribute-value match here be case-insensitive since
> the match in the following rule is case-insensitive?
>

Yes, these should both be case-insensitive.


> also, what if attribute-value is empty because SameSite had no value ?
> though
> I suppose the above rule addresses that i.e. it would ignore the cookie-av
> ?
>

That is the intent, yes.


> >    2.  Let "enforcement" be "Lax" if "cookie-av"'s "attribute-value" is
> >        a case-insensitive match for "Lax", and "Strict" otherwise.
> >
> >    3.  Append an attribute to the "cookie-attribute-list" with an
> >        "attribute-name" of "SameSite" and an "attribute-value" of
> >        "enforcement".
>
> oh, so enforcement here is just a local algorithm variable?  and step 3 is
> actually saying to use the value of enforcement as the attr-value for
> SameSite
> ?  if so, perhaps ought to be clarified..
>

I understand sections 5.2.* of RFC6265 to be accepting an attribute-name
and attribute-value string (see step 6 of that document's 5.2). The
attribute that's appended to 'cookie-attribute-list' has its own attribute
name and value.

I agree that this is a little confusing, and that we could make the
expected inputs/outputs clearer when we revise the document.
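
To make the inputs and outputs concrete, here is a rough sketch of the
three steps of 4.1 with both matches made case-insensitive, as agreed
above. The function name `process_samesite` and the list-of-tuples shape of
the cookie-attribute-list are assumptions made for illustration; neither
comes from RFC6265 or the draft.

```python
def process_samesite(attribute_value, cookie_attribute_list):
    """Apply the three SameSite steps to one cookie-av.

    `attribute_value` is the already-parsed attribute-value string;
    `cookie_attribute_list` is a list of (name, value) tuples that is
    extended in place and also returned.
    """
    value = attribute_value.lower()
    # Step 1: anything other than Strict or Lax (including an empty value)
    # means the cookie-av is ignored.
    if value not in ("strict", "lax"):
        return cookie_attribute_list
    # Step 2: "enforcement" is a local variable holding the chosen mode.
    enforcement = "Lax" if value == "lax" else "Strict"
    # Step 3: append a new attribute whose value is that enforcement mode.
    cookie_attribute_list.append(("SameSite", enforcement))
    return cookie_attribute_list
```

Under this reading, a bare `SameSite` with no value arrives as an empty
attribute-value and is dropped in step 1, which matches the intent stated
above.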


> > 4.1.1.  "Strict" and "Lax" enforcement
> >
> >    By default, same-site cookies will not be sent along with top-level
> >    navigations.  As discussed in Section 5.2, this might or might not be
> >    compatible with existing session management systems.  In the
> >    interests of providing a drop-in mechanism that mitigates the risk of
> >    CSRF attacks, developers may set the "SameSite" attribute in a "Lax"
> >    enforcement mode that carves out an exception which sends same-site
> >    cookies along with cross-site requests if and only if they are top-
> >    level navigations which use a "safe" (in the [RFC7231] sense) HTTP
> >    method.
> >
> >    Lax enforcement provides reasonable defense in depth against CSRF
> >    attacks that rely on unsafe HTTP methods (like "POST"), but do not
>
> s/do not/does not/  ?
>

Indeed!


> >    offer a robust defense against CSRF as a general category of attack:
> >
> >    1.  Attackers can still pop up new windows or trigger top-level
> >        navigations in order to create a "same-site" request (as
> >        described in section 2.1), which is only a speedbump along the
> >        road to exploitation.
> >
> >    2.  Features like "<link rel='prerender'>" [prerendering] can be
> >        exploited to create "same-site" requests without the risk of user
> >        detection.
> >
> >    When possible, developers should use a session management mechanism
> >    such as that described in Section 5.2 to mitigate the risk of CSRF
> >    more completely.
>
> "Strict" enforcement is not explicitly defined?
>
> hm, i guess it is defined in S 3.2 -- perhaps add a x-ref to that?
>

Clarified this in the first sentence (followed by a link to 5.2).

> 4.2.  Monkey-patching the Storage Model
> >
> >    Note: There's got to be a better way to specify this.  Until I figure
> >    out what that is, monkey-patching!
> >
> >    Alter Section 5.3 of [RFC6265] as follows:
> >
> >    1.  Add "samesite-flag" to the list of fields stored for each cookie.
>
> s/Add/Add (in the first paragraph)/  -- the list of fields is in the first
> parag of S 5.3
>

Poked at this.


> >        This field's value is one of "None", "Strict", or "Lax".
>
> i'd prefix the above sentence with 'Note: '
>

Sure. I think we could be a bit clearer with 5.3's definition entirely if
we gave a normative description of possible values/types for each of the
flags, but as-is, a note is reasonable.


> >    2.  Before step 11 of the current algorithm, add the following:
>
> the below are to be new top-level algorithm steps?
>

These would be new steps 11 and 12. The current step 11 would be bumped
down to 13, and subsequent steps would follow along as 14, 15, etc.


> >        1.  If the "cookie-attribute-list" contains an attribute with an
> >            "attribute-name" of "SameSite", set the cookie's "samesite-
> >            flag" to "attribute-value" ("Strict" or "Lax").  Otherwise,
> >            set the cookie's "samesite-flag" to "None".
> >
> >        2.  If the cookie's "samesite-flag" is not "None", and the
> >            request which generated the cookie's client's "site for
> >            cookies"
>
> the term "cookie's client('s)" is used only here (and also isn't defined in
> RFC6265) -- what is its definition?
>

It's actually referring to "request's client" from FETCH; that is,
"request-which-generated-the-cookie's client". This would, I'm sure, be
simpler in German. :)

> >            is not an exact match for "request-uri"'s host's
> >            registrable domain, then abort these steps and ignore the
> >            newly created cookie entirely.
>
> so the above step applies whether the enforcement policy is lax or strict?
>

Correct. The intent is to store cookies with the SameSite attribute only
when set in a same-site context.
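
Roughly, the two new storage steps could look like the sketch below. It is
an illustration under stated assumptions: `storage_decision` is a made-up
name, the attribute list is modelled as (name, value) tuples, the two
domains are passed in as plain strings, and letting the last `SameSite`
entry win is my own choice rather than something the draft spells out.

```python
def storage_decision(cookie_attribute_list, site_for_cookies, request_domain):
    """Return (store, samesite_flag) for a newly received cookie.

    `site_for_cookies` is the "site for cookies" of the client of the
    request that generated the cookie; `request_domain` is the registrable
    domain of the request-uri's host.
    """
    # New step 11: derive the samesite-flag, defaulting to "None".
    flag = "None"
    for name, value in cookie_attribute_list:
        if name == "SameSite":
            flag = value  # "Strict" or "Lax"; last occurrence wins here
    # New step 12: a same-site cookie set from a cross-site context is
    # ignored entirely, whether enforcement is Lax or Strict.
    if flag != "None" and site_for_cookies != request_domain:
        return (False, flag)
    return (True, flag)
```

Cookies without the attribute are unaffected by step 12, so existing
cross-site `Set-Cookie` behaviour stays as it is today.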


> > 4.3.  Monkey-patching the "Cookie" header
> >
> >    Note: There's got to be a better way to specify this.  Until I figure
> >    out what that is, monkey-patching!
> >
> >    Alter Section 5.4 of [RFC6265] as follows:
> >
> >    1.  Add the following requirement to the list in step 1:
>
> to the end of the existing bullet list in [RFC6265] ?
>

Yes. I've clarified this.


> >        *  If the cookie's "samesite-flag" is not "None", and the HTTP
> >           request is cross-site (as defined in Section 2.1 then exclude
>                                                             ^
>                                                             )
>

Done.
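
For what it's worth, the added requirement combined with the "Lax"
carve-out from 4.1.1 might be sketched as follows. `should_attach` and its
parameters are hypothetical names; the same-site and top-level-navigation
determinations are assumed to have been made elsewhere and are passed in as
booleans.

```python
# Methods defined as safe by RFC 7231.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def should_attach(samesite_flag, is_same_site, is_top_level_navigation, method):
    """Decide whether a stored cookie may go into the Cookie header."""
    # Cookies without the attribute, and all same-site requests, behave
    # exactly as they do under RFC6265.
    if samesite_flag == "None" or is_same_site:
        return True
    # Cross-site request, "Lax": only top-level navigations that use a
    # safe HTTP method carry the cookie.
    if samesite_flag == "Lax":
        return is_top_level_navigation and method in SAFE_METHODS
    # Cross-site request, "Strict": never attached.
    return False
```

So a cross-site top-level `GET` carries a "Lax" cookie but not a "Strict"
one, and a cross-site `POST` carries neither.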


> > 5.  Authoring Considerations
>
> this section is intended to be added to 6265bis?
>

That's up to the group. I think it makes sense to describe how we intend
for developers to actually use this thing. It certainly makes sense in the
context of this document, and I think it probably also makes sense as a new
section in 6265bis that describes authoring considerations with this and
other attributes.


> >    Developers can avoid this confusion by adopting a session management
> >    system that relies on not one, but two cookies: one conceptualy
>
> s/conceptualy/conceptually/
>

Thanks!


> >    granting "read" access, another granting "write" access.  The latter
> >    could be marked as "SameSite", and its absence would provide a
>
> s/provide/prompt/ ?
>

Yes.


>
> >    reauthentication step before executing any non-idempotent action.
> >    The former could drop the "SameSite" attribute entirely, or choose
> >    the "Lax" version of enforcement, in order to allow users access to
> >    data via top-level navigation.
>
> so the above is based on the sites using cookie-based ambient authn (yes?),
> which is a reasonable assumption I suppose but perhaps should be stated
> explicitly?
>

I added "cookie-based" to the first paragraph of the section, which I hope
helps clarify.
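
A server-side sketch of that two-cookie arrangement, purely as an
illustration: the cookie names `sid_read` and `sid_write` and the function
`gate_action` are invented for this example, and the caller is assumed to
know whether the requested action is idempotent.

```python
def gate_action(cookie_names, is_idempotent):
    """Decide how to treat a request under the two-cookie scheme.

    `cookie_names` is the set of cookie names present on the request.
    `sid_read` is the cookie sent without SameSite (or with Lax);
    `sid_write` is the cookie marked SameSite.
    """
    if "sid_read" not in cookie_names:
        return "login"           # no session at all
    if is_idempotent:
        return "allow"           # "read" access is sufficient
    if "sid_write" in cookie_names:
        return "allow"           # request carried the "write" cookie
    return "reauthenticate"      # "write" cookie absent: prompt the user
```

A cross-site navigation then still lets the user read their data, while a
non-idempotent action arriving without the "write" cookie triggers the
reauthentication step.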

> 5.3.  Mashups and Widgets
> >
> >    The "SameSite" attribute is inappropriate for some important use-
> >    cases.  In particular, note that content intended for embedding in a
> >    cross-site contexts (social networking widgets or commenting
> >    services, for instance) will not have access to such cookies.
>
> by "such cookies" you mean same-site cookies ?
>

Clarified.


>
> >                                                               Cross-
> >    site cookies may be required in order to provide seamless
> >    functionality that relies on a user's state.
>
> by "cross-site cookies" you mean cookies lacking the SameSite attribute?
> this
> is the only place this term is used at this time...
>

Reworded.


> >    Likewise, some forms of Single-Sign-On might require authentication
>
> by "authentication" here you mean ambient cookie-based authn ?
>

Yes. Made that explicit.


> >    in a cross-site context; these mechanisms will not function as
> >    intended with same-site cookies.
>
> and thus will need to (continue) to use so-called cross-site cookies ?
>

Yes, but that seems clear from context.


> > 6.  Privacy Considerations
>
> the below are to be added to 6265bis ?


Yes.


> >
> > 6.1.  Server-controlled
> >
> >    Same-site cookies in and of themselves don't do anything to address
> >    the general privacy concerns outlined in Section 7.1 of [RFC6265].
> >    The attribute is set by the server, and serves to mitigate the risk
>         ^
>      SameSite ?
>
> >    of certain kinds of attacks that the server is worried about.  The
> >    user is not involved in this decision. Moreover, a number of side-
> >    channels exist which could allow a server to link distinct requests
> >    even in the absence of cookies.
>
> you mean to say "..could allow a third-party server to link distinct
> top-level
> navigation requests to first-party servers, even in the absence of
> cookies." ?
>

"top-level navigation" is too limiting. Servers receive channel ID/token
binding data for any request, for instance, not just navigational requests.


> >                                     Connection and/or socket pooling,
> >    Token Binding, and Channel ID all offer explicit methods of
> >    identification that servers could take advantage of.
>
> I'm not sure the above sentence is applicable to the third-party cookie
> discussion in RFC6265 S 7.1. My understanding is that Token Binding and
> ChannelID present server-specific identifiers and cannot be used by a
> third-party server to track the UA's top-level navigation requests to
> various
> first-party servers.
>

Both of these mechanisms send origin-specific data to a server as part of a
request. They have similar properties to cookies, in that they can be used
to correlate requests in a top-level, nested, or subresource context.
They're not only sent along with top-level requests, and their privacy
concerns aren't limited to those top-level contexts (e.g. `google.com` in a
frame gets a channel ID as part of the TLS handshake: the same ID, in fact,
that it receives in a top-level context).


> > 6.2.  Pervasive Monitoring
> >
> >    As outlined in [RFC7258], pervasive monitoring is an attack.  Cookies
> >    play a large part in enabling such monitoring, as they are
> >    responsible for maintaining state in HTTP connections.  We considered
> >    restricting same-site cookies to secure contexts [secure-contexts] as
> >    a mitigation but decided against doing so, as this feature should
>
> by "this feature" you mean same-site cookies ?
>

Correct.


>
> >    result in a strict reduction in the number of cookies floating around
> >    in cross-site contexts. That is, even if "http://not-example.com"
> >    embeds a resource from "http://example.com/", that resource will not
> >    be "same-site", and "http://example.com"'s cookies simply cannot be
> >    used to correlate user behavior across distinct origins.
> >
> > 7.  References
>
>
> <snip/>
>
> >    [app-isolation]
> >               Chen, E., Bau, J., Reis, C., Barth, A., and C. Jackson,
> >               "App Isolation - Get the Security of Multiple Browsers
> >               with Just One", n.d.,
> >               <http://www.collinjackson.com/research/papers/
> >               appisolation.pdf>.
>
> the date of this paper appears to be 2011..
>
> CCS’11, October 17–21, 2011, Chicago, Illinois, USA.
> Copyright 2011 ACM 978-1-4503-0948-6/11/10
>
>
> end
>
>
>
Thank you, this was super-helpful. :)

I've made a number of changes in
https://github.com/httpwg/http-extensions/commit/263db30abb6f0f0b5a0cc3c2c3424108b3c171bc,
which I hope addresses the above. I'll make more changes to the various
algorithms in patches to HTML, Fetch, URL, et al.

-mike

Received on Tuesday, 21 June 2016 12:41:20 UTC