- From: Marcos Del Sol Vives <notifications@github.com>
- Date: Thu, 28 Nov 2024 01:44:54 -0800
- To: whatwg/url <url@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/url/issues/821/2505678469@github.com>
@hsivonen You are right. Reading it again, UTS 46 does indeed require a validation step that is not affected by the IgnoreInvalidPunycode flag, so even with that flag set, parsing said domain would still fail.

Chromium, in its URL parsing, seems to just leave fully ASCII hostnames as-is (see https://github.com/chromium/chromium/blob/9df64a975a05e623c6f53e2e2a1936226b8dc42e/url/url_canon_host.cc#L467-L476 and https://github.com/chromium/chromium/blob/451e794a3a3abc8d999c4682da559ce1885af849/net/dns/dns_config_service_win.cc#L363-L373, for example).

When decoding, it keeps invalid IDNA as-is. And by invalid, I mean all non-conforming hostnames. For example:

- `xn--espaa-rta.orca.pet`, which is the hostname with proper NFKC normalization, is displayed in the navbar as `españa.orca.pet`.
- `xn--espana-0xd.orca.pet`, which uses an invalid NFD normalization, is displayed as the original ASCII string, keeping the domain accessible while making homoglyph attacks impossible.

![imagen](https://github.com/user-attachments/assets/884092e9-7ccd-4763-aa2f-eb6670bb3c5c)

In my humble opinion, this is a perfect solution: every single RFC 1034-conforming host remains accessible, while homoglyph attacks become impossible (which is, I think, ultimately the whole point of validation).

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/821#issuecomment-2505678469
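
The display behaviour described above can be roughly sketched in Python. This is only an illustration, not Chromium's actual logic and not a full UTS 46 validator: the helper names `display_label` and `display_host` are hypothetical, and the only validity check applied is the UTS 46 requirement that a decoded label be in NFC. A label that fails to decode, or decodes to non-NFC text, is shown as its original ASCII form.

```python
import unicodedata


def display_label(label: str) -> str:
    """Return the Unicode form of an xn-- label if it decodes to NFC text,
    otherwise return the label unchanged (hypothetical helper)."""
    if not label.lower().startswith("xn--"):
        return label  # plain ASCII label: leave as-is
    try:
        decoded = label[4:].encode("ascii").decode("punycode")
    except UnicodeError:
        return label  # not valid Punycode: keep the ASCII form
    if not unicodedata.is_normalized("NFC", decoded):
        return label  # e.g. NFD-encoded text: keep the ASCII form
    return decoded


def display_host(host: str) -> str:
    """Apply display_label to every dot-separated label of a hostname."""
    return ".".join(display_label(part) for part in host.split("."))


if __name__ == "__main__":
    print(display_host("xn--espaa-rta.orca.pet"))
    print(display_host("xn--espana-0xd.orca.pet"))
```

With the two hostnames from the comment, the first label decodes to precomposed `ñ` (U+00F1) and is shown in Unicode, while the second decodes to `n` followed by a combining tilde (U+0303), fails the NFC check, and stays in its ASCII form. Either way the hostname itself is never rejected, which is the property argued for above.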
Received on Thursday, 28 November 2024 09:44:57 UTC