Re: Genart last call review of draft-ietf-httpbis-rfc6265bis-19 from Dale R. Worley on 2025-09-20 (ietf-http-wg@w3.org from July to September 2025)

From: Dale R. Worley <worley@ariadne.com>
Date: Sat, 20 Sep 2025 15:38:20 -0400
To: ietf-http-wg@w3.org, <gen-art@ietf.org>
Message-ID: <87a52ox6c3.fsf@hobgoblin.ariadne.com>

My apologies on the delay in responding to this.  (Was gen-art included
as an addressee?)

A large number of items where we are in agreement are omitted here.

> From: Steven Bingler <bingler@chromium.org>
> Date: Mon, 25 Aug 2025 15:46:09 -0400
> To: ietf-http-wg@w3.org
> Subject: Re: Genart last call review of draft-ietf-httpbis-rfc6265bis-19
> Archived-At: <https://www.w3.org/mid/CAKvzGWfJMhX24droK1A20TCkC0BOzeMt8htnSuUd++Snx-j-9g@mail.gmail.com>
> 
> Not entirely sure why, but everything past the link got cut off.
> Re-sending my response below:
> 
> Hello Dale and Gen-ART,
> 
> Thank you for the review and my apologies for the late reply. I had to
> take a hiatus.
> 
> Find my responses below. The changes I've made have not yet been
> published as a draft but are available to view on the github repo.
> https://github.com/httpwg/http-extensions/blob/main/draft-ietf-httpbis-rfc6265bis.md
> 
> > This is ambiguous for parsing extension-av.  E.g.
> >
> > Set-Cookie: name=value;attr1= v a l u e ;attr2=x
> >
> > Does the value of attr1 start with "v" or with " "?  Does it end with
> > "e" or with " "?
> 
> Given RFC9110 5.6.3 indicates that BWS MUST be removed before
> interpreting the protocol element it seems to me that Section 4's
> grammar strongly implies that leading and trailing whitespace are not
> allowed. I agree that being more explicit about that is a good idea.
> 
> The "notes" section beneath the grammar, while a bit crowded, feels
> like a good place for an advisement along the lines of:
> "Per the grammar above, cookie-avs MUST NOT contain leading or
> trailing WSP characters as they will be interpreted as BWS and
> removed."

OK, I hadn't been aware of RFC 9110 sec. 5.6.3.  Given its overarching
scope, it's clear that non-null BWS "should not" appear and that any
recipient of it MUST act as if it was not there.  And any implementer
of Cookies SHOULD be aware of those rules.

However, it's not clear to me how 5.6.3 applies to extension-av
specifically.  And my analysis in my report probably was incorrect.  The
difficulty appears only in regard to extension-av; all defined options
have unambiguous grammars.  cookie-av appears in the context:

   set-cookie-string = BWS cookie-pair *( BWS ";" OWS cookie-av )

that is, preceded by OWS and followed by BWS (if there is a following
option).  cookie-av has extension-av as an alternative, which is
defined as:

   extension-av      = *av-octet
   av-octet          = %x20-3A / %x3C-7E
                         ; any CHAR except CTLs or ";"

The problem is that SPC can be both the first and last character of
extension-av, where it cannot be distinguished from being part of a
preceding OWS or a following BWS.  (There isn't a problem determining
unambiguously that an internal SPC is part of the extension-av; I was
mistaken in that.)  The BWS would have to be ignored of course, and if I
understand correctly, the OWS is not significant.

It's also surprising that a null extension-av is allowed, although I
don't think it is harmful.

I think a good solution would be just

   extension-av      = *av-octet
                         ; neither the first nor last CHAR is %x20

or better

   extension-av      = 1*av-octet
                         ; neither the first nor last CHAR is %x20

which would be clearer than trying to express that restriction in the
BNF itself.
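
As a rough illustration of the second form (a Python sketch, not anything
taken from the draft), the rule "1*av-octet, neither the first nor last
CHAR is %x20" can be checked with a regular expression.  The names
is_valid_extension_av and split_attributes are made up for this example,
and the splitting step simply assumes that SP/HTAB around each ";" is
treated as OWS/BWS and discarded.

```python
import re

# av-octet = %x20-3A / %x3C-7E  (any CHAR except CTLs or ";")
_AV_OCTET = r"[\x20-\x3A\x3C-\x7E]"
# The same set without SPC (%x20), for the first and last positions.
_NON_SP = r"[\x21-\x3A\x3C-\x7E]"

# Assumed rule: 1*av-octet, neither the first nor last CHAR is %x20.
_EXTENSION_AV = re.compile(rf"{_NON_SP}(?:{_AV_OCTET}*{_NON_SP})?")


def is_valid_extension_av(text: str) -> bool:
    """Return True if text satisfies the proposed extension-av rule."""
    return _EXTENSION_AV.fullmatch(text) is not None


def split_attributes(set_cookie_string: str) -> list[str]:
    """Split off the attributes after the cookie-pair.

    Whitespace (SP / HTAB) on either side of each ";" is treated as
    OWS/BWS and removed; internal spaces are left untouched.
    """
    parts = set_cookie_string.split(";")
    return [part.strip(" \t") for part in parts[1:]]


if __name__ == "__main__":
    for av in split_attributes("name=value;attr1= v a l u e ;attr2=x"):
        print(repr(av), is_valid_extension_av(av))
```

With this rule a leading or trailing SPC can only belong to the
surrounding OWS/BWS, so the split is unambiguous; internal SPCs stay
part of the extension-av.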

> > 5.2.2.  Worker-based requests
> 
> That note is out of date and can be removed, we do allow for
> cross-origin workers using data: urls.
> 
> > What is the universe of characters?
> 
> Characters are to be treated as individual octets that align with
> ASCII. I've added a new note specifying that.
> 
> > Unfortunately, this question is significant.  For example in sec. 5.6 is
>
> We actually do want UAs to accept octets past 0x7F (for better or
> worse). Disallowing these octets is very likely a breaking change
> and would need to be more carefully rolled out.

Yeah, we have to be upward-compatible with reality.  But if we do accept
octets past 0x7F, what characters do they represent and how?
"Characters are to be treated as individual octets that align with
ASCII." no longer suffices.

It might be worth stating that *generating* cookies containing octets
beyond 0x7F is deprecated.  That would prepare for such a change later.
Or is the long-term concept that 0x80 to 0xFF are expected to be
supported indefinitely (either as support for 8-bit character sets like
Latin-1 or for UTF-8)? ...

But this again gets back to the question of what the universe of
characters is.  It seems that in practice, and it is a practice we
want/need/must support indefinitely, the universe is octets minus
%x00-08 / %x0A-1F / %x7F, but in some places restricted to a smaller
set.  Though
"characters" may not be the right word ... probably "character
encodings", since a header is actually a sequence of character
encodings.  It seems to me that it might be worth stating that somewhere
toward the top of the document, to provide the reader with context for
the specific rules that follow.
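
To make that concrete, here is a small Python sketch of the octet
universe as I described it above.  It is only my reading, not text from
the draft, and the names EXCLUDED_OCTETS and disallowed_positions are
invented for the example.

```python
# Octets excluded from the universe as described above:
# %x00-08 / %x0A-1F / %x7F.  Everything else, including HTAB (%x09)
# and %x80-FF, is accepted.
EXCLUDED_OCTETS = (
    frozenset(range(0x00, 0x09)) | frozenset(range(0x0A, 0x20)) | {0x7F}
)


def disallowed_positions(header_value: bytes) -> list[int]:
    """Return the indexes of octets that fall outside the universe."""
    return [i for i, octet in enumerate(header_value)
            if octet in EXCLUDED_OCTETS]


if __name__ == "__main__":
    print(disallowed_positions(b"name=caf\xc3\xa9"))   # octets past 0x7F
    print(disallowed_positions(b"name=a\x00b"))        # contains NUL
```

Note that the check works purely on octets: it accepts %x80-FF without
saying anything about which characters those octets represent, which is
exactly the open question.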

> > and the various algorithms made consistent with it.  For example in
> > sec. 5.7, there are explicit limitations on the characters in cookie
> > names, cookie values, and domain attributes, but not for some other
> > parts of the Value.
> 
> I'm not sure I follow. Are you referring to, for example, the
> limitation in 5.7 Step 3, summarized as: "If the cookie name or value
> contains a CTL character drop the cookie"?

My point was that all parts of all values "should" be subject to the
same rules regarding what "characters" are and are not allowed.  But I
think you are stating that the current practice is not consistent
regarding allowed characters in different contexts, and so the
desirability of a consistent character set can't be accommodated.

> > 6.2.  Application Programming Interfaces
> > Is "esoteric" the correct word?  It seems that "complex" is more
> > correct.
> 
> Esoteric feels like it does apply, it is likely only understood by a
> few, maybe "esoteric handling" would be a better phrasing, to group
> the syntax/semantic complexities together. This part was inherited
> from RFC6265 and I tend to enjoy the doc's occasional obscure word
> choices (such as "infelicities").

This suggests the meta-question, "Is the use of 'infelicities' an
infelicity?"

Dale
Received on Tuesday, 23 September 2025 05:12:23 UTC