[whatwg/url] Expand which hostnames are considered IPv4 addresses (Issue #679)

Follow-up to #560 

I wonder whether it would be possible to broaden the linked change to cover all hostnames whose final label begins with an ASCII digit. That would match RFC 2396 from the IETF very nicely:

> Hostnames take the form described in Section 3 of [RFC1034] and
   Section 2.1 of [RFC1123]: a sequence of domain labels separated by
   ".", each domain label starting and ending with an alphanumeric
   character and possibly also containing "-" characters.  **The rightmost
   domain label of a fully qualified domain name will never start with a
   digit, thus syntactically distinguishing domain names from IPv4
   addresses**

https://datatracker.ietf.org/doc/html/rfc2396#section-3.2.2

This kind of alignment is valuable for interoperability with older standards. As more software starts to use the WHATWG standard, there are more chances for mismatches when some subsystems still use older standards, and those mismatches can open the door to SSRF attacks. So understanding the differences between the standards (and minimising them where practical/possible) is really important.

If we could broaden the change in this way, it would mean we can say with confidence that both standards agree about which hostnames are domains vs. which are IPv4. Of course, we accept more than just dotted-decimal IPv4, but we'd at least agree about what the hostname is supposed to mean. The IETF has been promising since the late 90s that TLDs won't ever begin with a digit, so it seems... maybe safe?

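This change would make the "ends-in-a-number" checker a superset of its current implementation (more hostnames would be considered IPv4, and nothing that is currently IPv4 would be considered a domain). It's also slightly computationally cheaper, since only the first code point of the final label needs to be inspected. It means that URLs like `http://hello.0a` would fail to parse, rather than being valid as they are today: the host would be handed to the IPv4 parser, which rejects `0a` because it is not a valid number.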
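
To make the difference concrete, here is a rough Python sketch of both checkers. The "current" version is my paraphrase of the ends-in-a-number checker and the IPv4 number syntax check as I understand them, not the spec text itself, and all function names are made up for illustration.

```python
from typing import Optional

ASCII_DIGITS = set("0123456789")
OCTAL_DIGITS = set("01234567")
HEX_DIGITS = set("0123456789abcdefABCDEF")


def _last_label(host: str) -> Optional[str]:
    """Return the final label, ignoring one trailing dot; None if there is none."""
    parts = host.split(".")
    if parts[-1] == "":
        if len(parts) == 1:
            return None
        parts.pop()
    return parts[-1]


def _is_ipv4_number(label: str) -> bool:
    """Syntax-only check (hypothetical helper): decimal, 0-prefixed octal, or 0x-prefixed hex."""
    if label == "":
        return False
    allowed = ASCII_DIGITS
    if len(label) >= 2 and label[:2] in ("0x", "0X"):
        label, allowed = label[2:], HEX_DIGITS
    elif len(label) >= 2 and label[0] == "0":
        label, allowed = label[1:], OCTAL_DIGITS
    # An empty remainder (e.g. a bare "0x") counts as the number 0.
    return all(ch in allowed for ch in label)


def ends_in_a_number_current(host: str) -> bool:
    """Paraphrase of today's checker: last label is all digits or an IPv4 number."""
    last = _last_label(host)
    if last is None:
        return False
    if last != "" and all(ch in ASCII_DIGITS for ch in last):
        return True
    return _is_ipv4_number(last)


def ends_in_a_number_proposed(host: str) -> bool:
    """Proposed checker: last label merely begins with an ASCII digit."""
    last = _last_label(host)
    return last is not None and last != "" and last[0] in ASCII_DIGITS


if __name__ == "__main__":
    for host in ["192.168.0.1", "example.0x10", "hello.0a", "example.1a", "example.com"]:
        print(host, ends_in_a_number_current(host), ends_in_a_number_proposed(host))
```

Under this sketch, `hello.0a` and `example.1a` are the cases that change: the current checker treats them as domains, while the proposed one routes them to the IPv4 parser, where they would fail.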

https://github.com/whatwg/url/issues/679

Received on Monday, 3 January 2022 05:25:28 UTC