Re: [whatwg/fetch] [Feature request] Add a `CookieStore` option to `Request` & `Response` (Issue #1384)

This feature request could be generalized somewhat. The fundamental problem is that server-side HTTP clients need a way to get cookies from request A and send those cookies back to the server in request B.  Adding a `CookieStore` option is one way to solve this (perhaps the best way?) but if Node offers a different (but still ergonomic and secure) way to propagate cookies, that may be OK.

The rest of this comment provides more explanation and justification for this feature request.

### Server-Side Cookie Use Cases

Copying from https://github.com/nodejs/node/pull/41749#issuecomment-1025440856:


> Use cases for simulating browsers exist on a continuum. On one end is automated testing of browser apps and other cases where you want the client to work _exactly_ like a real browser. For these cases, heavyweight tools like puppeteer are great because they run JS and really act like a browser.
>
> On the other end of the continuum are simpler cases where you don't care if the client works exactly like a browser, you just need it to work _enough_ like a browser that the server won't reject requests. These cases are common when extracting data from a website that doesn't offer an API, aka screen scraping. You can also use heavyweight tools for this case too, but in my experience plain HTTP requests are preferred because they're much faster, use much less RAM, place fewer demands on remote servers (e.g. won't download scripts and CSS), and have fewer dependencies & moving parts that can fail.
>
> Screen-scraping seems like it should be unusual, but in my experience there are quite a few production apps (and many more internal apps or side projects) which sadly need to interact with remote servers that only have a browser interface, not an API. Sometimes it's because the remote server is a legacy system that can't easily be updated, which is common in government, banking, telecom, or honestly almost anything in non-tech companies with limited IT resources. Sometimes it's because you're trying to automate access to a remote system whose owners don't want you to automate it, e.g. competitors. Sometimes it's because getting API access requires jumping through so many hoops that it's not worth the hassle. But it's frustratingly common, and it'd be great to be able to handle these cases without having to reach for a scraping library when a cookie-jar-enabled HTTP client could be all that's needed.

Here's another discussion of use cases from https://github.com/nodejs/node/pull/41749#issuecomment-1025311795:


> It’s a common use case to want to request data from an API that requires authentication, where that authentication check takes the form of checking for a cookie. It’s not only done when you’re trying to simulate a browser.

### Workarounds

Below is what server apps must do to enable cookie propagation between requests. There may be more steps, but the two below are the ones I know about.  Note that redirect requests require different handling to support cases where the cookie is set by a redirect response.

1. Read and parse the `Set-Cookie` headers of each response, and then manually set the `Cookie` header of subsequent requests. Libraries like [tough-cookie](https://github.com/salesforce/tough-cookie) can help with the parsing and emitting.
2. Ensure that the `redirect` option is set to [`"manual"`](https://developer.mozilla.org/en-US/docs/Web/API/fetch#:~:text=manual%3A%20Caller%20intends,for%20more%20information.) (not the default `"follow"`). Then manually handle all redirection in userland code, including setting the Cookie header as described in (1). I'm not sure if there's an existing library that will handle this redirect logic in the same way that tough-cookie will handle (1). Note that, unlike in the browser, responses using `"manual"` will provide callers access to the headers, per https://github.com/nodejs/node/pull/41749#issuecomment-1025473227. (Manual redirect responses in the browser are opaque and don't provide readable headers to userland code, as discussed in https://github.com/nodejs/node/pull/41749#issuecomment-1025468187)
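
To make the two steps above concrete, here is a minimal sketch of what the userland code might look like. Everything named here (`NaiveCookieJar`, `ResponseLike`, `FetchLike`, `fetchWithJar`) is hypothetical and invented for illustration. The jar is deliberately naive: it keeps only `name=value` pairs and ignores `Domain`, `Path`, `Expires`, `Secure`, and `SameSite`, so it is not safe to reuse across origins, and a real app would likely delegate that part to something like tough-cookie. The sketch also handles only GET requests and assumes the `fetch` implementation exposes headers on manual-redirect responses, as discussed above for Node.

```typescript
// Hypothetical, deliberately naive jar: stores name=value pairs only and
// ignores Domain, Path, Expires, Secure, and SameSite.
class NaiveCookieJar {
  private cookies = new Map<string, string>();

  // Record the name=value pair from each raw Set-Cookie header value.
  store(setCookieValues: string[]): void {
    for (const raw of setCookieValues) {
      const pair = raw.split(";")[0];
      const eq = pair.indexOf("=");
      if (eq > 0) {
        this.cookies.set(pair.slice(0, eq).trim(), pair.slice(eq + 1).trim());
      }
    }
  }

  // Build the value for the Cookie request header ("" when the jar is empty).
  header(): string {
    return Array.from(this.cookies, ([k, v]) => `${k}=${v}`).join("; ");
  }
}

// Assumed minimal shapes, so a fake fetch can be injected for testing.
interface ResponseLike {
  status: number;
  headers: { get(name: string): string | null; getSetCookie?: () => string[] };
}
type FetchLike = (
  url: string,
  init: { redirect: "manual"; headers: Record<string, string> },
) => Promise<ResponseLike>;

// Step (2): follow redirects by hand so that step (1) runs on every hop.
async function fetchWithJar(
  url: string,
  jar: NaiveCookieJar,
  fetchImpl: FetchLike,
  maxRedirects = 10,
): Promise<ResponseLike> {
  let current = url;
  for (let hop = 0; hop <= maxRedirects; hop++) {
    const cookie = jar.header();
    const res = await fetchImpl(current, {
      redirect: "manual",
      headers: cookie ? { cookie } : {},
    });

    // Prefer getSetCookie() where the runtime provides it; otherwise fall
    // back to get(), treating the value as one cookie (a known limitation).
    const single = res.headers.get("set-cookie");
    jar.store(res.headers.getSetCookie?.() ?? (single ? [single] : []));

    const location = res.headers.get("location");
    const isRedirect = res.status >= 300 && res.status < 400;
    if (!isRedirect || !location) return res;

    // Location may be relative, so resolve it against the current URL.
    current = new URL(location, current).toString();
  }
  throw new Error("too many redirects");
}
```

A caller would create one `NaiveCookieJar` per logical session and pass the runtime's `fetch` as `fetchImpl`, which keeps cookies from different sessions apart. Even this sketch leaves out scoping cookies to the right host and path, changing the method on 303 responses, and dropping credentials on cross-origin redirects.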

### Justification

Both (1) and (2) above are non-trivially complex. Both cookies and redirection are fraught with potential security issues and the manual steps described above are quite un-ergonomic.

So, realistically, I'd expect that most cookie-needing Node apps will rely on an (as yet unwritten?) library that wraps `fetch` and handles cookies and redirects. Hopefully this library will not have security or perf problems! 

### More Discussion

I don't know enough about the security implications to have an opinion about whether cookie control as recommended in the OP is needed for `fetch` in browsers. It's possible that it may be a really bad idea in the browser for security reasons. But on the server there does need to be a way to propagate cookies from one request to later requests, because a single global cookie jar is not acceptable for server-side use.



-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/fetch/issues/1384#issuecomment-1029487351

Message ID: <whatwg/fetch/issues/1384/1029487351@github.com>

Received on Thursday, 3 February 2022 23:13:54 UTC