- From: Petr Špaček <notifications@github.com>
- Date: Mon, 17 Feb 2025 03:41:49 -0800
- To: whatwg/url <url@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/url/issues/245/2662868049@github.com>
pspacek left a comment (whatwg/url#245)

Hi, a DNS guy here. Allow me to describe this from a DNS perspective:

> Maybe I'm wrong but aren't valid domains defined in the RFCs below?
>
> * https://www.ietf.org/rfc/rfc1034.txt
> * https://www.ietf.org/rfc/rfc1123.txt
>
> The first one saying:
>
> <domain> ::= <subdomain> | " "

Indeed, that is wrong in a subtle way. This quote comes from section [3.5. Preferred name syntax](https://datatracker.ietf.org/doc/html/rfc1034#section-3.5) of RFC 1034, with emphasis on **preferred**.

The real limits of the DNS protocol are made clear in [11. Name syntax](https://datatracker.ietf.org/doc/html/rfc2181#section-11) of RFC 2181. TL;DR: anything goes, including binary 0 (ASCII `NUL`) and `.`.

These weird-but-permissible-in-DNS names are then encoded into ASCII strings like `\000\..example.com.`, where the leftmost label consists of two ASCII characters:

- `NUL`
- `.`, which is a character **inside** the leftmost label, not a label separator

We could argue that URL should be concerned only with **host names** (as opposed to **domains**), and then the quote might be more fitting, but that ignores IDNA completely. RFC 5890 defines a stricter subset of permissible names in ASCII encoding...

I'm happy to discuss further if there's interest!
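To make the `\000\..example.com.` example concrete, here is a minimal Python sketch of how raw DNS labels might be rendered as an ASCII string, alongside a check for the stricter letters-digits-hyphen host name syntax. The function names `label_to_text`, `name_to_text`, and `is_ldh_label` are hypothetical, and the escaping is simplified: it only backslash-escapes `.` and `\` and writes other non-printable bytes as three decimal digits, whereas real DNS tools escape a few more characters.

```python
import re

# Letters-digits-hyphen label, 1 to 63 octets, as relaxed by RFC 1123
# (a leading digit is allowed). Hypothetical helper for illustration.
_LDH = re.compile(rb"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def label_to_text(label: bytes) -> str:
    """Render one raw label as ASCII text (simplified escaping)."""
    out = []
    for byte in label:
        ch = chr(byte)
        if ch in ".\\":
            # A dot or backslash inside a label must be escaped so it is
            # not mistaken for a label separator or an escape introducer.
            out.append("\\" + ch)
        elif 0x21 <= byte <= 0x7E:
            out.append(ch)
        else:
            # Anything else becomes a backslash plus three decimal digits.
            out.append("\\%03d" % byte)
    return "".join(out)


def name_to_text(labels: list) -> str:
    """Join labels into a fully qualified name with a trailing dot."""
    return "".join(label_to_text(label) + "." for label in labels)


def is_ldh_label(label: bytes) -> bool:
    """True if the label fits the preferred host name syntax."""
    return _LDH.fullmatch(label) is not None


weird = [b"\x00.", b"example", b"com"]
print(name_to_text(weird))            # \000\..example.com.
print([is_ldh_label(l) for l in weird])  # [False, True, True]
```

The leftmost label is valid at the DNS protocol level but fails the host name check, which is the distinction between **domains** and **host names** drawn above.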
Received on Monday, 17 February 2025 11:41:53 UTC