Re: Dnsdir telechat review of draft-ietf-httpbis-rfc6265bis-19

Petr,

I changed xn-label to a-label as suggested.

The PR is also now merged, thanks for your assistance.

- Steven

On Tue, Nov 18, 2025 at 4:16 AM Petr Špaček <pspacek@isc.org> wrote:
>
> Hi Steven.
>
> I've responded on GitHub to point out that Step 2 should read as
> "2. All labels must be one of U-label, A-label, or Non-Reserved LDH
> (NR-LDH)"
>
> See RFC5890 section 2.3.1:
>      ... For
>      IDNA-aware systems, the valid label types are: A-labels, U-labels,
>      and NR-LDH labels. ...
>
>
> Anyway, what I was primarily trying to say in the previous message is
> that the "Domain Matching" algorithm works ONLY when combined with the
> "Canonicalized Host Names" algorithm as required by RFC5890 section
> 2.3.1. It will break in weird ways if the input to the current version
> of the "Domain Matching" algorithm is not correctly sanitized.
>
> Petr Špaček
>
>
> On 14. 11. 25 18:17, Steven Bingler wrote:
> > Hi Petr,
> >
> > Thanks for taking a look. I'll follow up once the PR is merged and,
> > unless you see another issue in the meantime, I'll consider this
> > review completed.
> >
> >> I'm saying this just to make people aware there would be dragons if
> >> "Canonicalized Host Names" section was less strict.
> >
> > Just to be clear, the current "Canonicalized Host Names" algorithm
> > doesn't have any issues?
> >
> > Thanks,
> > - Steven
> >
> > On Fri, Nov 14, 2025 at 5:53 AM Petr Špaček <pspacek@isc.org> wrote:
> >>
> >> Hi Steven,
> >>
> >> Thank you for the changes. I went through
> >> https://github.com/httpwg/http-extensions/pull/3327
> >> and it seems good to me. I've submitted a couple of wording nits into the PR.
> >>
> >> I could not break it, under the assumption that only permitted names are
> >> encoded into ASCII-only + the new restrictions added into "Canonicalized
> >> Host Names" section.
> >>
> >>
> >> Side note:
> >>
> >> The algorithm in
> >> ### Domain Matching
> >> works in exactly the opposite way to normal DNS name matching.
> >>
> >> DNS software first breaks names into labels and then operates on
> >> individual labels, comparing them from most significant to least
> >> significant.
> >>
> >> The algorithm in the document instead constructs one string per full
> >> name and then compares the strings from right to left, which relies on
> >> lexical properties of allowed algorithm inputs to make it work.
> >>
> >> In the case of this document this seems to work because the "Canonicalized
> >> Host Names" section limits inputs and removes the possibility of having "."
> >> as part of the label.
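
A minimal Python sketch of the contrast described above, not taken from the draft or from any DNS implementation. The function names are hypothetical, the IP-address exclusion in the draft's domain-match is omitted, and label comparison is simplified to ASCII case folding.

```python
def domain_match_string(host: str, domain: str) -> bool:
    """String-suffix matching: identical, or host ends with "." + domain."""
    if host == domain:
        return True
    return host.endswith("." + domain)


def domain_match_labels(host_labels: list[bytes], domain_labels: list[bytes]) -> bool:
    """Label-wise matching, starting from the most significant (rightmost) label."""
    if len(domain_labels) > len(host_labels):
        return False
    for h, d in zip(reversed(host_labels), reversed(domain_labels)):
        if h.lower() != d.lower():
            return False
    return True


# A two-label name whose first label contains a literal "." octet.
odd_labels = [b"a.example", b"com"]
# Naively joining the labels loses the label boundaries.
odd_as_string = ".".join(label.decode("ascii") for label in odd_labels)

string_result = domain_match_string(odd_as_string, "example.com")        # True
label_result = domain_match_labels(odd_labels, [b"example", b"com"])      # False
```

The two functions agree only as long as no label can contain ".", which is the property the canonicalization step is relied on to guarantee.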
> >>
> >>
> >> FTR a name like this would be allowed by DNS itself:
> >>
> >> label\.with\.three\.dots.example.com
> >>
> >> Text representation above is written using RFC1035 escaping rules.
> >> On the wire it is simply a 0x2e octet in the label octet sequence. Each
> >> label is prefixed by a label-length field. I.e. labels can contain
> >> arbitrary binary garbage.
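
As an illustration of the wire form mentioned above, here is a small Python sketch of RFC 1035 style name encoding (one length octet per label, a zero octet for the root). The helper name `to_wire` is hypothetical, and the 255-octet limit on the whole name is not checked.

```python
def to_wire(labels: list[bytes]) -> bytes:
    """Encode labels as: length octet, label octets, ..., terminating zero octet."""
    out = bytearray()
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError("each label must be 1..63 octets long")
        out.append(len(label))  # label-length field
        out += label            # label octets, copied without interpretation
    out.append(0)               # zero-length root label ends the name
    return bytes(out)


# The example name from the message: the first label contains three 0x2e octets.
wire = to_wire([b"label.with.three.dots", b"example", b"com"])
```

Because the boundaries come from the length octets, the 0x2e octets inside the first label are ordinary data and do not act as separators.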
> >>
> >> I'm saying this just to make people aware there would be dragons if
> >> "Canonicalized Host Names" section was less strict.
> >>
> >> With that - I wish you smooth progress through the publication process!
> >>
> >> Petr Špaček
> >>
> >>
> >>
> >>
> >> On 13. 11. 25 19:15, Steven Bingler wrote:
> >>> Hi Petr,
> >>>
> >>> Thanks for your responses. Please find my follow-ups immediately below,
> >>> followed by further comments on your original review.
> >>>
> >>>> With my software developer hat on...
> >>>
> >>> After taking another look, I don't think Section 4.1.2.3 needs this note.
> >>> Section 4 is defining the well behaved server syntax so including a
> >>> note that bad behavior is accepted seems counterproductive.
> >>>
> >>> We've had a number of discussions in the past regarding the messiness
> >>> of the Section 4 (Server) and Section 5 (User Agent) behavior
> >>> differences and what we have now is the result of trying to clean up
> >>> what we can. It's an idiosyncrasy to be sure but likely one that is
> >>> going to take much more work in both the spec space and real world in
> >>> order to resolve.
> >>>
> >>>> Apologies, that was a typo. I meant RFC 5890 page 10
> >>>
> >>> Thanks for clarifying.
> >>>
> >>>> Perhaps add a sentence like "weird inputs will be rejected because they
> >>>> will not match" or something?
> >>>
> >>> I'm disinclined to add a note since 5.6.3 is only concerned with
> >>> getting the value of the `Domain=` attribute into the expected form.
> >>> It doesn't know or otherwise care about the value itself.  Section 5.7
> >>> Step 10 is what actually examines the value:
> >>> ```
> >>>    10 If the domain-attribute is non-empty:
> >>>      1. If the canonicalized request-host does not domain-match the
> >>> domain-attribute:
> >>>        1. Abort this algorithm and ignore the cookie entirely.
> >>> ```
> >>>
> >>>> Step 10: request-host value is canonicalized, but the domain-attribute value is
> >>>> NOT canonicalized here. Is that intentional?
> >>>
> >>> The producing server is expected to provide a domain value that
> >>> matches the request url as processed by the user agent (i.e.: Yes).
> >>>
> >>> While I wasn't around for the original decision I suspect that Unicode
> >>> was not considered at the time, meaning that the `Domain=` value
> >>> simply needed to match the site's domain.
> >>>
> >>>> See above about the difference between producer and consumer grammar.
> >>>
> >>> Ah. Yes, in short the spec advises a well behaved syntax that it
> >>> recommends servers adhere to but, in the interest of wider
> >>> compatibility, the spec was resigned to include non-ideal behavior
> >>> that we've seen in the wild, e.g. historical behavior, common
> >>> mistakes, etc.
> >>>
> >>> --- Original review comments below ---
> >>>
> >>>> Considering the prevalence of this problem in the HTTP specs, I'm not against
> >>>> keeping the status quo if authors decide to do so, but I think it should be
> >>>> acknowledged at the beginning of the document.
> >>>
> >>> I've added a new section at the beginning for how name resolution
> >>> should be handled. Namely, only ASCII and ACE are supported and the
> >>> document is written from the perspective of DNS.
> >>>
> >>>>> 2.3. Terminology
> >>>>> Whenever possible, user agents SHOULD use an up-to-date public suffix list,
> >>>>> such as the one maintained by the Mozilla project at [PSL].
> >>>
> >>> Agreed, created a new security section and referenced it from 2.3 instead.
> >>>
> >>>>> 4.1.1. Syntax
> >>>>> The domain-value is a subdomain as defined by [RFC1034], Section 3.5, and as
> >>>>> enhanced by [RFC1123], Section 2.1. Thus, domain-value is a string of [USASCII]
> >>>>> characters, such as an "A-label" as defined in Section 2.3.2.1 of [RFC5890].
> >>>
> >>>> This might work if we assume the underlying naming system is DNS...
> >>>
> >>> Should be covered by the new name resolution section.
> >>>
> >>>>> 5.1.2. Canonicalized Host Names
> >>>>> A canonicalized host name is the string generated by the following algorithm:
> >>>>>
> >>>>> 1. Convert the host name to a sequence of individual domain name labels.
> >>>>>
> >>>>> 2. Convert each label that is not a Non-Reserved LDH (NR-LDH) label, to an
> >>>>> A-label (see Section 2.3.2.1 of [RFC5890] for the former and latter).
> >>>>>
> >>>>> 3. Concatenate the resulting labels, separated by a %x2E (".") character.
> >>>>
> >>>> This algorithm does not handle all possible inputs...
> >>>
> >>> I've edited the algorithm to explicitly work on U-labels, XN-labels,
> >>> and NR-LDH labels and to fail for all other inputs as well as fake
> >>> A-label outputs.
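
A rough Python sketch of the label classification this relies on, following the RFC 5890 section 2.3.1 terminology. It is not the draft's algorithm: `classify_label` and its return strings are hypothetical, and it does not tell a real A-label from a fake one or validate a U-label, both of which need full IDNA2008 processing.

```python
import re

# LDH label: ASCII letters, digits, hyphen; no leading or trailing hyphen; 1..63 octets.
_LDH = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def classify_label(label: str) -> str:
    """Coarse classification of a single label (assumed category names)."""
    if not label:
        return "invalid"
    if not label.isascii():
        return "U-label candidate"  # still needs IDNA2008 validation
    if not _LDH.fullmatch(label):
        return "invalid"
    if label[2:4] == "--":          # reserved LDH: "--" in positions 3 and 4
        if label[:2].lower() == "xn":
            return "XN-label"       # A-label or fake A-label, undecided here
        return "other R-LDH"
    return "NR-LDH"
```

Under the edited algorithm as described above, "invalid" and "other R-LDH" inputs would be the ones to fail, and an XN-label would still need checking before being accepted as an A-label.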
> >>>
> >>>>> 5.1.3. Domain Matching
> >>>
> >>> I've added a note to the top of the algorithm.
> >>>
> >>>>> If the canonicalized request-host does not domain-match the domain-attribute:
> >>>>
> >>>> I would add reference for "domain-match" definition in sec. 5.1.3.
> >>>
> >>> Done.
> >>>
> >>>>> 8.7. Reliance on DNS
> >>>>
> >>>> This is first and only mention of 'DNS' in the text...
> >>>
> >>> This concern should be handled with the new name resolution section.
> >>>
> >>> Finally, please take a look at my proposed changes and let me know if
> >>> you have any comments. You can find the PR here:
> >>> https://github.com/httpwg/http-extensions/pull/3327
> >>>
> >>> Thanks,
> >>> - Steven
> >>>
> >>>
> >>>
> >>> On Sat, Nov 8, 2025 at 6:54 AM Petr Špaček <pspacek@isc.org> wrote:
> >>>>
> >>>> On 13. 10. 25 16:12, Steven Bingler wrote:
> >>>>> Thank you for your thorough review. My apologies for the long delayed
> >>>>> response, I had to take a hiatus.
> >>>>
> >>>> Hello,
> >>>>
> >>>> and I apologize for the delay as well, the last weeks were turbulent. Too
> >>>> bad we could not meet at the IETF venue, pen and paper might be useful for
> >>>> some of these :-)
> >>>>
> >>>>> I'm still working through the issues that you've highlighted. I'm not
> >>>>> as familiar as I'd like to be with name resolution systems so I have
> >>>>> some further discussions about the issues.
> >>>>
> >>>> Happy to answer any questions (if I know answers...)!
> >>>>
> >>>> Perhaps this older e-mail could serve as an illustration of the problem
> >>>> at hand with different naming systems and their different encoding rules
> >>>> for names:
> >>>>
> >>>> https://mailarchive.ietf.org/arch/msg/last-call/bruydK32zq7pIep1VprdRwujcEo/
> >>>>
> >>>>>>> (Note that a leading %x2E ("."), if present, is ignored even though that
> >>>>> character is not permitted.)
> >>>>>> Should this be mentioned in the 4.1.1. Syntax? This inconsistency makes me
> >>>>>> wince.
> >>>>>
> >>>>> It's my understanding that the `domain-value      = <subdomain>`
> >>>>> syntax already disallows the leading '.', but that for historical
> >>>>> reasons some servers will still produce it, hence the note.
> >>>>
> >>>> With my software developer hat on, it makes me mad that the document
> >>>> lays out a formal grammar and then free form text elsewhere says "ya
> >>>> know, ignore the grammar and do this". It kind of defeats the purpose of
> >>>> a formal grammar!
> >>>>
> >>>> I think it would lower the risk of misunderstandings if the grammar itself
> >>>> was absolutely clear. Something along these lines:
> >>>>
> >>>> GENERATOR syntax:
> >>>> domain-av         = "Domain" BWS "=" BWS domain-value
> >>>>
> >>>> CONSUMER syntax:
> >>>> domain-av         = "Domain" BWS "=" BWS [.]domain-value
> >>>>
> >>>> (or some other suitable form)
> >>>>
> >>>> Or just rename the section to 'Syntax for generators' (or producers or
> >>>> whatever term you find descriptive) to make it clear it does not apply
> >>>> to consumers.
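
To make the generator/consumer asymmetry concrete, a minimal Python sketch of the consumer side as the quoted note describes it: a single leading "." is tolerated and dropped. The function name is hypothetical, and the lower-casing step is an assumption about the user agent processing rather than something stated in this thread.

```python
def consume_domain_value(attribute_value: str) -> str:
    """Consumer-side handling: ignore one leading ".", then lower-case."""
    if attribute_value.startswith("."):
        attribute_value = attribute_value[1:]  # drop only the first "."
    return attribute_value.lower()


lenient = consume_domain_value(".Example.COM")
strict = consume_domain_value("example.com")
```

A generator following the stricter grammar would only ever emit the second form, yet both inputs end up as the same cookie-domain on the consumer side.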
> >>>>
> >>>> This ties to my complaint at the end of your reaction - difference
> >>>> between allowed behavior of generator vs. consumer.
> >>>>
> >>>>
> >>>>>>> 5.1.2. Canonicalized Host Names
> >>>>>> This algorithm does not handle all possible inputs.
> >>>>>> Using terminology from RFC 5890 sec. 2.3.1: DNS name (RFC 1035) > LDH host
> >>>>> name (RFC 1123) > R-LDH Label (RFC5890) > XN-label > Fake A-label vs. A-label
> >>>>>
> >>>>> Is the issue here that the current algorithm will, incorrectly,
> >>>>> instruct to convert a reserved LDH label into an (fake) A-label which
> >>>>> is invalid?
> >>>>>
> >>>>>> According to diagram in RFC 5860 page 10,
> >>>>> I can't find the diagram you're referring to.
> >>>>
> >>>> Apologies, that was a typo. I meant RFC 5890 page 10 (the same number as
> >>>> in previous paragraph).
> >>>>
> >>>>
> >>>>>>> 5.6.3. The Domain Attribute
> >>>>>> The preamble of section 5.6
> >>>>> explicitly states weird inputs are to be expected
> >>>>>>> 5.7. Storage Model
> >>>>>
> >>>>> What this algorithm is relying on is that this domain attribute's
> >>>>> value must match up with the request url which would mean that any
> >>>>> "weird" character inputs, "~bla!.example.com", would cause that
> >>>>> matching to fail and the cookie to be discarded.
> >>>>
> >>>> Perhaps add a sentence like "weird inputs will be rejected because they
> >>>> will not match" or something?
> >>>>
> >>>>
> >>>>>>> 5.8.3. Retrieval Algorithm
> >>>>>> Sections 5.7 Storage Model and 5.8 Retrieval Model sort of ignore the role of
> >>>>>> 'generator', i.e. the server which needs to properly form cookies. Perhaps it
> >>>>>> is okay, but it has surprised me. In DNS spec we often have 'server' and
> >>>>>> 'client' parts in the spec, but here we seem to have only 'client'.
> >>>>>
> >>>>> Sorry, I don't follow. Could you rephrase the issue?
> >>>>
> >>>> See above about the difference between producer and consumer grammar.
> >>>> It's an illustration of the problem I had in mind, I guess, but it was
> >>>> half a year ago so I might be misremembering things.
> >>>>
> >>>> --
> >>>> Petr Špaček
> >>
> >>
> >> --
> >> Petr Špaček
>
>
> --
> Petr Špaček

Received on Tuesday, 18 November 2025 18:40:02 UTC