- From: sleevi <notifications@github.com>
- Date: Thu, 12 Dec 2019 18:26:32 -0800
- To: whatwg/url <url@noreply.github.com>
- Cc: Subscribed <subscribed@noreply.github.com>
- Message-ID: <whatwg/url/issues/458/565273240@github.com>
On Thu, Dec 12, 2019 at 4:10 PM achristensen07 <notifications@github.com> wrote:

> I have three concerns:
>
> 0. Are there any real registered domains with '*', '^', '|', or '"' in
>    them? I imagine there are rules somewhere in ICANN preventing this, but it
>    would be good to reference them.

RFC1034 sets out the LDH rule for preferred name syntax, and the ICANN Registry Agreement (specifically, Spec 6 - Interoperability and Continuity) restricts registerable domains to that.

Of course, the world knows that beyond that, madness lies - because DNS wire-form is 8-bit, it can have any form, and even though A/AAAA are “supposed” to follow preferred name syntax (as host records), buggy servers combined with generic client libraries (that support other non-DNS resolution paths) can let anything through. The most obvious case is underscores.

So these URLs would appear either in private networks, non-DNS host schemes, or as subdomains of registered names taking advantage of lax client behavior. While you can’t issue TLS certificates to these names directly, you can sneak by with wildcards, sadly.

> 1. What led to these characters being forbidden in Gecko? Will we want
>    to change this set of forbidden characters again after this?
> 2. Are there any URLs with custom schemes with those in their host?
>    This is harder to find out. I hope the compatibility risk is minimal but I
>    don't have a good way to find out except changing it and seeing which
>    things break.

Yeah, this is the analysis I mentioned we’d have to do for Chrome. URL parsing changes are generally accompanied by analyzing corpora like the entire Google search index to see what compatibility risk might be had, and that’s a Lot Of Work compared to changing a few characters in a lookup table to zero. 😕

I think it’s worth doing, and I think it’s worth aligning on.

> 1. This is also a nudge to Chrome and Firefox to implement hosts in
>    URLs with custom schemes according to the spec.

Yes. It’s well deserved and the biggest issue with our URL parsing 😔

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/458#issuecomment-565273240
Received on Friday, 13 December 2019 02:26:34 UTC