- From: Steven Bingler <bingler@chromium.org>
- Date: Wed, 1 Oct 2025 12:09:33 -0400
- To: "Dale R. Worley" <worley@ariadne.com>
- Cc: draft-ietf-httpbis-rfc6265bis.all@ietf.org, ietf-http-wg@w3.org, last-call@ietf.org, gen-art@ietf.org
Hi Dale,

> (Was gen-art included as an addressee?)

It was not. My mistake. You can find all the changes in the latest draft,
https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-rfc6265bis,
plus a recent small commit:
https://github.com/httpwg/http-extensions/commit/59f5e21e84e19ba81eae5484a22a2ed8dec9f445

> It's also surprising that a null extension-av is allowed, although I
> don't think it is harmful.

That's surprising to me as well now that you've pointed it out. I
believe this is a long-lived mistake. The -00 draft,
https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-rfc6265bis-00#section-4.1.1,
specifies that:

   extension-av = <any CHAR except CTLs or ";">

which I interpret as requiring one or more CHARs. This changed in -01 to:

   extension-av = *av-octet

Meaning that your suggestion

>    extension-av = 1*av-octet

has likely been the intended grammar. We'll be suddenly deprecating
null extension-avs, but I doubt that will be an issue.

>                   ; neither the first nor last CHAR is %x20

I think this comment/note is more appropriate below. All the existing
comments within the syntax exist to explain the grammar. I've added a
note explaining not to use leading or trailing WSP.

> But if we do accept
> octets past 0x7F, what characters do they represent and how?

They're effectively opaque and meaningless; UAs are not instructed to
interpret anything beyond 0x7F.

> It might be worth stating that *generating* cookies containing octets
> beyond 0x7F is deprecated.

The well-behaved server syntax, Section 4.1.1, already disallows
octets above 0x7E.

> Or is the long-term concept that 0x80 to 0xFF are expected to be
> supported indefinitely

I would personally like to see UA support for 0x80-0xFF removed but I
don't expect it to occur any time soon.

> It seems to me that it might be worth stating that somewhere
> toward the top of the document, to provide the reader with context for
> the specific rules that follow.

I've added a note within the "Overview" section near the top.
https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-rfc6265bis-21#name-overview

> But I
> think you are stating that the current practice is not consistent
> regarding allowed characters in different contexts, and so the
> desirability of a consistent character set can't be accommodated.

Correct.

- Steven

On Tue, Sep 23, 2025 at 1:16 AM Dale R. Worley <worley@ariadne.com> wrote:
>
> My apologies on the delay in responding to this. (Was gen-art included
> as an addressee?)
>
> A large number of items where we are in agreement are omitted here.
>
> > From: Steven Bingler <bingler@chromium.org>
> > Date: Mon, 25 Aug 2025 15:46:09 -0400
> > To: ietf-http-wg@w3.org
> > Subject: Re: Genart last call review of draft-ietf-httpbis-rfc6265bis-19
> > Archived-At: <https://www.w3.org/mid/CAKvzGWfJMhX24droK1A20TCkC0BOzeMt8htnSuUd++Snx-j-9g@mail.gmail.com>
> >
> > Not entirely sure why, but everything past the link got cut off.
> > Re-sending my response below:
> >
> > Hello Dale and Gen-ART,
> >
> > Thank you for the review and my apologies for the late reply. I had to
> > take a hiatus.
> >
> > Find my responses below. The changes I've made have not yet been
> > published as a draft but are available to view on the github repo.
> > https://github.com/httpwg/http-extensions/blob/main/draft-ietf-httpbis-rfc6265bis.md
> >
> > > This is ambiguous for parsing extension-av. E.g.
> > >
> > >    Set-Cookie: name=value;attr1= v a l u e ;attr2=x
> > >
> > > Does the value of attr1 start with "v" or with " "? Does it end with
> > > "e" or with " "?
> >
> > Given RFC9110 5.6.3 indicates that BWS MUST be removed before
> > interpreting the protocol element it seems to me that Section 4's
> > grammar strongly implies that leading and trailing whitespace are not
> > allowed. I agree that being more explicit about that is a good idea.
> >
> > The "notes" section beneath the grammar, while a bit crowded, feels
> > like a good place for an advisement along the lines of:
> > "Per the grammar above, cookie-avs MUST NOT contain leading or
> > trailing WSP characters as they will be interpreted as BWS and
> > removed."
>
> OK, I hadn't been aware of RFC 9110 sec. 5.6.3. Given its overarching
> scope, it's clear that non-null BWS "should not" appear and that any
> recipient of it MUST act as if it was not there. And any implementer
> of Cookies SHOULD be aware of those rules.
>
> However, it's not clear to me how 5.6.3 applies to extension-av
> specifically. And my analysis in my report probably was incorrect. The
> difficulty appears only in regard to extension-av; all defined options
> have unambiguous grammars. cookie-av appears in the context:
>
>    set-cookie-string = BWS cookie-pair *( BWS ";" OWS cookie-av )
>
> that is, preceded by OWS and followed by BWS (if there is a following
> option). cookie-av has extension-av as an alternative, which is
> defined as:
>
>    extension-av = *av-octet
>    av-octet     = %x20-3A / %x3C-7E
>                   ; any CHAR except CTLs or ";"
>
> The problem is that SPC can be both the first and last character of
> extension-av, where it cannot be distinguished from being part of a
> preceding OWS or a following BWS. (There isn't a problem determining
> unambiguously that an internal SPC is part of the extension-av; I was
> mistaken in that.) The BWS would have to be ignored of course, and if I
> understand correctly, the OWS is not significant.
>
> It's also surprising that a null extension-av is allowed, although I
> don't think it is harmful.
>
> I think a good solution would be just
>
>    extension-av = *av-octet
>                   ; neither the first nor last CHAR is %x20
>
> or better
>
>    extension-av = 1*av-octet
>                   ; neither the first nor last CHAR is %x20
>
> which would be clearer than the BNF to specify that.
>
> > > 5.2.2. Worker-based requests
> >
> > That note is out of date and can be removed, we do allow for
> > cross-origin workers using data: urls.
> >
> > > What is the universe of characters?
> >
> > Characters are to be treated as individual octets that align with
> > ASCII. I've added a new note specifying that.
> >
> > > Unfortunately, this question is significant. For example in sec. 5.6 is
> >
> > We actually do want UAs to accept octets past 0x7F (for better or
> > worse). Disallowing these octets is very likely a breaking change
> > and would need to be more carefully rolled out.
>
> Yeah, we have to be upward-compatible with reality. But if we do accept
> octets past 0x7F, what characters do they represent and how?
> "Characters are to be treated as individual octets that align with
> ASCII." no longer suffices.
>
> It might be worth stating that *generating* cookies containing octets
> beyond 0x7F is deprecated. That would prepare for such a change later.
> Or is the long-term concept that 0x80 to 0xFF are expected to be
> supported indefinitely (either as support for 8-bit character sets like
> Latin-1 or for UTF-8)? ...
>
> But this again gets back to the question of what the universe of
> characters is. It seems that in practice, and practice we
> want/need/must support indefinitely, is octets, minus %x00-08 / %x0A-1F
> / %x7F but in some places restricted to a smaller set. Though
> "characters" may not be the right word ... probably "character
> encodings", since a header is actually a sequence of character
> encodings. It seems to me that it might be worth stating that somewhere
> toward the top of the document, to provide the reader with context for
> the specific rules that follow.
>
> > > and the various algorithms made consistent with it. For example in
> > > sec. 5.7, there are explicit limitations on the characters in cookie
> > > names, cookie values, and domain attributes, but not for some other
> > > parts of the Value.
> >
> > I'm not sure I follow. Are you referring to, for example, the
> > limitation in 5.7 Step 3, summarized as: "If the cookie name or value
> > contains a CTL character drop the cookie"?
>
> My point was that all parts of all values "should" be subject to the
> same rules regarding what "characters" are and are not allowed. But I
> think you are stating that the current practice is not consistent
> regarding allowed characters in different contexts, and so the
> desirability of a consistent character set can't be accommodated.
>
> > > 6.2. Application Programming Interfaces
> > > Is "esoteric" the correct word? It seems that "complex" is more
> > > correct.
> >
> > Esoteric feels like it does apply, it is likely only understood by a
> > few, maybe "esoteric handling" would be a better phrasing, to group
> > the syntax/semantic complexities together. This part was inherited
> > from RFC6265 and I tend to enjoy the doc's occasional obscure word
> > choices (such as "infelicities").
>
> This suggests the meta-question, "Is the use of 'infelicities' an
> infelicity?"
>
> Dale
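
For illustration only, the extension-av rule discussed in this thread (1*av-octet, plus the note that the first and last octets should not be %x20) can be sketched as a small check. This is a minimal sketch and not text from the draft: the helper names `is_av_octet` and `is_valid_extension_av` are hypothetical, and it models only the well-behaved server syntax of Section 4.1.1, not the more lenient UA parsing algorithm.

```python
def is_av_octet(byte: int) -> bool:
    """av-octet = %x20-3A / %x3C-7E, i.e. any CHAR except CTLs or ";"."""
    return 0x20 <= byte <= 0x3A or 0x3C <= byte <= 0x7E


def is_valid_extension_av(value: bytes) -> bool:
    """Hypothetical check for the proposed rule: 1*av-octet, no edge SP."""
    if len(value) == 0:
        # 1*av-octet requires at least one octet, so a null value fails.
        return False
    if value[0] == 0x20 or value[-1] == 0x20:
        # A leading or trailing SP cannot be told apart from the
        # surrounding OWS/BWS, so it is rejected here.
        return False
    # Every octet must fall in the av-octet range (nothing above 0x7E).
    return all(is_av_octet(b) for b in value)
```

Under these assumptions, `b"attr1=v a l u e"` passes because internal spaces are unambiguous, while `b" v a l u e "`, the empty value, and any value containing `;` or an octet above 0x7E are rejected.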
Received on Wednesday, 1 October 2025 16:09:47 UTC