Re: Handling Cookies is a Minefield

I don't think that quite captures the problem. The spec is currently
written assuming that these two divisions are the same:

1. Are you a cookie producer (strict) or cookie consumer (permissive)?
2. Are you a server (strict) or a user agent (permissive)?

See 3.2.1 and 3.2.2, and how they forward to Sections 4 and 5, which cover
servers and user agents, respectively. The blog post observes that a server
acts as *both* a cookie producer *and* a cookie consumer: when it sets
cookies, it is a producer, but when it receives them back again, it is a
consumer. Being strict in that second, consumer case will cause problems,
so we actually need to be lax there. (*At least* lax enough to gracefully
parse the request and skip the cookie. You don't necessarily have to pass
it on to your web application logic if you don't think the web application
will handle it. But you can't 500 the request if you want to avoid
problems.)
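
To make the producer/consumer split concrete, here is a rough sketch of what
I mean, not text from the spec. The names (is_strict_ascii,
serialize_set_cookie, parse_cookie_header) are made up for illustration, and
the "strict" profile is a deliberate simplification (printable ASCII, no
quotes or separators) rather than the real grammar. The point is only that
the producer path may reject, while the consumer path skips and never raises.

```python
from typing import Dict, List, Tuple


def is_strict_ascii(text: str) -> bool:
    """Simplified, hypothetical strict profile: printable ASCII, no separators."""
    return all(0x21 <= ord(ch) <= 0x7E and ch not in '";,\\' for ch in text)


def serialize_set_cookie(name: str, value: str) -> str:
    """Producer side: refuse to emit anything outside the strict profile."""
    if not name or not is_strict_ascii(name) or not is_strict_ascii(value):
        raise ValueError("refusing to emit a non-conforming cookie")
    return f"{name}={value}"


def parse_cookie_header(raw: bytes) -> Tuple[Dict[str, str], List[bytes]]:
    """Consumer side: never raise; return (accepted cookies, skipped pairs)."""
    accepted: Dict[str, str] = {}
    skipped: List[bytes] = []
    for pair in raw.split(b";"):
        pair = pair.strip(b" \t")
        if not pair:
            continue
        name, sep, value = pair.partition(b"=")
        try:
            name_s = name.decode("ascii")
            value_s = value.decode("ascii")
        except UnicodeDecodeError:
            # Non-ASCII bytes: skip this one cookie, keep serving the request.
            skipped.append(pair)
            continue
        if not sep or not name_s:
            skipped.append(pair)
            continue
        if not is_strict_ascii(name_s) or not is_strict_ascii(value_s):
            skipped.append(pair)
            continue
        accepted[name_s] = value_s
    return accepted, skipped


# A request carrying one cookie the server would never have produced itself.
header = "sid=abc123; lang=日本語; theme=dark".encode("utf-8")
accepted, skipped = parse_cookie_header(header)
```

Here the request still succeeds with sid and theme available to the
application, and the lang cookie lands in the skipped list, where a server
could log it or ignore it. Whether skipped cookies should ever be surfaced to
the application is a separate policy question that this sketch does not
settle.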

(Alternatively, we make the permissive mode also ASCII-only, but the data
seems to suggest this alternative is not viable.)

David

On Wed, Dec 4, 2024 at 4:39 PM Mark Nottingham <mnot@mnot.net> wrote:

> Would it be worthwhile to have a discussion of this situation in the
> document?
>
> E.g., something in the Overview along the lines of
>
> 3.3. Unicode and Cookies
>
> [paraphrasing] The parsing algorithm allows Unicode characters to occur in
> cookies, but the serialization algorithm does not allow their emission.
> This is because non-ASCII characters are not widely interoperable in HTTP
> headers, including cookies; common libraries do not handle them properly
> and intermediaries might not forward them without changes. As a result,
> while Unicode in cookie values might work in controlled or limited
> circumstances, its use is discouraged by this specification.
>
> Cheers,
>
>
> > On 5 Dec 2024, at 4:59 AM, David Benjamin <davidben@chromium.org> wrote:
> >
> > On Tue, Dec 3, 2024 at 11:27 PM Ryan Hamilton <rch@google.com> wrote:
> > On Tue, Dec 3, 2024 at 9:37 AM David Benjamin <davidben@chromium.org>
> wrote:
> >
> > Regardless, I think which spec is where is mostly a distraction. When
> something is ill-defined, fixing the ill-definedness necessarily involves a
> feedback loop between spec and implementation, with changes on both sides,
> until we figure out where to converge. Different communities manage that
> feedback loop differently. The mishmash of specs you see is a symptom of
> all this work not being done.
> >
> > If we had infinite energy, could resolve problems at infinite speed, and
> had infinite bandwidth for coordination, the compatibility needs of the
> HTTP ecosystem (web and non-web) would be perfectly uniform, the IETF
> general-HTTP-level specifications would perfectly match those needs, and
> the web stuff could cleanly layer on top of it, without having to override
> any of it. We do not live in that world, so here we are. But I think
> focusing on the symptom of our limitations doesn't help us move forward.
> How to move forward is to do the work to converge things.
> >
> > This! 100% this! The problem is not a lack of clear specification here
> (though the spec could certainly be improved). The problem is that the
> ecosystem as it currently exists relies on load-bearing, spec-non-compliant
> behavior. Changing those behaviors will break real-world users (as the
> linked paper explained). We can spec as much as we want but until we do the
> work to actually migrate these implementations/users, I suspect we'll be
> stuck.
> >
> > Well, that or update the specification when the breakage for real-world
> users is too great. Steven can speak more authoritatively, but converging
> on an ASCII-only notion of cookies does not look viable.
> >
> > David
>
> --
> Mark Nottingham   https://www.mnot.net/
>
>

Received on Wednesday, 4 December 2024 21:58:15 UTC