[whatwg/url] Malformed URL Normalization in Standard Introduces SSRF Risks (Issue #893)

HackingRepo created an issue (whatwg/url#893)

### What is the issue with the URL Standard?

### Problem
The current URL Standard mandates that malformed inputs such as `https:foo` or `http:127.0.0.1/` be normalized into valid URLs. This behavior is correct per spec and ensures interoperability across browsers. However, in non‑browser contexts (e.g., server‑side libraries, WAFs, SSRF filters), this normalization introduces **hidden security bypasses** (see https://github.com/swisskyrepo/PayloadsAllTheThings/pull/809).

For example:
- A filter may classify `http:127.0.0.1/` as malformed, so its host‑based rules never match.
- The parser then normalizes it into `http://127.0.0.1/`.
- The request succeeds, bypassing the filter.

This mismatch between “invalid” at the filter level and “valid” at the parser level creates exploitable gaps.
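
The gap can be reproduced with a few lines of TypeScript. This is a minimal sketch, not taken from any real product: `naiveFilterAllows` and its pattern are hypothetical stand-ins for a filter that inspects the raw string, while `new URL` is the standard WHATWG parser available in browsers and Node.js.

```typescript
// Hypothetical blocklist that inspects the raw string, as a naive SSRF filter might.
function naiveFilterAllows(rawInput: string): boolean {
  // The pattern assumes every http(s) URL contains "//" after the scheme.
  const blockedPattern = /^https?:\/\/(127\.|10\.|localhost)/i;
  return !blockedPattern.test(rawInput);
}

const rawInput = "http:127.0.0.1/";

// The raw string has no "//", so the blocklist pattern does not match.
const allowedByFilter = naiveFilterAllows(rawInput);

// The WHATWG parser accepts the same string and normalizes it.
const parsedUrl = new URL(rawInput);
const normalizedHref = parsedUrl.href;
const parsedHost = parsedUrl.hostname;

console.log({ allowedByFilter, normalizedHref, parsedHost });
// allowedByFilter: true, normalizedHref: "http://127.0.0.1/", parsedHost: "127.0.0.1"
```

The filter and the parser disagree about the same input: the filter sees nothing to block, while the parser produces a loopback URL.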

---

### Security Impact
- **SSRF filters**: Attackers can reach internal services by submitting malformed schemes that are later corrected.
- **WAF rules**: Defenders lose visibility because logs show the corrected input, not the original.
- **Cross‑ecosystem risk**: Node.js (`new URL`), Rust (`url` crate), Python (`urllib`), and other libraries all follow the spec, so the bypass is portable across languages.
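
To illustrate the visibility problem, here is a hedged sketch of a log record that keeps both forms of the input. `UrlAuditRecord` and `auditUrl` are hypothetical names introduced for this example; only `new URL` is a real API.

```typescript
// Hypothetical audit record: keeps the original input next to the parser's output.
interface UrlAuditRecord {
  raw: string;               // exactly what the client submitted
  normalized: string | null; // parser output, or null if parsing failed
  rewritten: boolean;        // true when the parser changed the input
}

function auditUrl(raw: string): UrlAuditRecord {
  let normalized: string | null = null;
  try {
    normalized = new URL(raw).href;
  } catch {
    // new URL throws a TypeError on inputs it cannot parse at all.
    normalized = null;
  }
  return { raw, normalized, rewritten: normalized !== null && normalized !== raw };
}

console.log(auditUrl("http:127.0.0.1/"));
// { raw: "http:127.0.0.1/", normalized: "http://127.0.0.1/", rewritten: true }
```

Note that `rewritten` is also true for benign changes, such as the trailing slash added to `http://example.com`, so it is a signal for review rather than proof of an attack.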

---

### Proposal
Introduce a **strict parsing mode** in the URL Standard:
- In strict mode, malformed inputs must be rejected outright.
- No implicit corrections should occur.
- This mode would be opt‑in for non‑browser contexts (servers, libraries, security tools).
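
As a rough illustration of what strict mode could look like from a caller's point of view, the sketch below wraps `new URL` with a few pre-checks. `parseStrict` and `SLASH_REQUIRED_SCHEMES` are hypothetical; nothing like them exists in the URL Standard today, and the checks cover only a small subset of the spec's validation errors (missing `//`, extra slashes, backslashes, stray whitespace).

```typescript
// Hypothetical strict wrapper; an assumption for discussion, not a spec API.
const SLASH_REQUIRED_SCHEMES = ["http", "https", "ws", "wss", "ftp"];

function parseStrict(input: string): URL {
  // Reject characters the spec-compliant parser would silently strip or rewrite.
  if (/[\t\n\r\\]/.test(input) || input !== input.trim()) {
    throw new TypeError("strict mode: tab, newline, backslash, or surrounding whitespace");
  }
  const schemeMatch = /^[a-zA-Z][a-zA-Z0-9+.-]*:/.exec(input);
  if (schemeMatch === null) {
    throw new TypeError("strict mode: missing scheme");
  }
  const scheme = schemeMatch[0].slice(0, -1).toLowerCase();
  const rest = input.slice(schemeMatch[0].length);
  if (SLASH_REQUIRED_SCHEMES.indexOf(scheme) !== -1) {
    // Require exactly two slashes before the host: "http:host" and "http:///host" both fail.
    if (!/^\/\/[^\/]/.test(rest)) {
      throw new TypeError("strict mode: scheme must be followed by exactly '//'");
    }
  }
  // Delegate to the standard parser once the pre-checks pass.
  return new URL(input);
}

console.log(parseStrict("http://127.0.0.1/").href); // "http://127.0.0.1/"
// parseStrict("http:127.0.0.1/") and parseStrict("https:foo") throw a TypeError.
```

Pre-checking the raw string is only one possible design; a parser that surfaces its own validation errors to the caller would avoid duplicating the grammar.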

Additionally:
- Add a **security note** to the spec warning developers that spec‑compliant normalization is unsafe for validation.
- Encourage libraries to expose both “spec mode” (for browser compatibility) and “strict mode” (for security‑sensitive contexts).

---

### Motivation
- **Balance interoperability and security**: Browsers need consistent parsing, but servers and filters need strict validation.
- **Developer clarity**: Many developers assume `new URL()` or equivalent is safe for validation. Explicit strict mode avoids this pitfall.
- **Precedent**: Other standards (HTML parsing) already distinguish between “quirks mode” and “standards mode.” A strict URL mode would follow this model.

---

### Discussion Points
- Should strict mode be opt‑in or opt‑out?
- Which malformed cases should be rejected (schemes, hosts, paths)?
- How can libraries expose strict mode without breaking existing code?

---


This proposal aims to make the URL Standard safer for security‑sensitive contexts while preserving interoperability for browsers.


-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/893

Message ID: <whatwg/url/issues/893@github.com>

Received on Saturday, 3 January 2026 19:50:57 UTC