Re: [whatwg/url] can't parse urls starting with xn-- (#438)

> what about `xn-ASCII` with no trailing dash? ... Will the updated UTS 46 also produce output for those?

I assume you mean `xn--ASCII` with a double hyphen. The difference is that the trailing hyphen in `xn--ASCII-` is the delimiter that separates the "basic characters" (ASCII) from the actual Punycode encoding. In this kind of label that encoded part is empty, so Punycode-decoding it just returns the ASCII part, and you have an alternate encoding of the same label. (Punycode decoding itself does not fail. It is IDNA2008 that fails the label, on a round-trip check.)

Without the trailing hyphen, the "ASCII" substring is not treated as literal ASCII at all; the whole of it is read as Punycode that encodes only non-ASCII characters.
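
To make the difference concrete, here is a minimal sketch using Python's built-in `punycode` codec. That codec is my choice for illustration, not something UTS #46 prescribes, and the helper name `decode_payload` is made up; it takes the part of the label after `xn--`.

```python
def decode_payload(payload: str) -> str:
    """Punycode-decode the part of a label that follows `xn--`."""
    return payload.encode("ascii").decode("punycode")


# With the trailing hyphen: everything before the last hyphen is the
# literal basic (ASCII) part, and the encoded part after it is empty.
with_delimiter = decode_payload("ASCII-")
print(repr(with_delimiter))  # 'ASCII'

# Without the hyphen there is no basic part; the letters are read as
# Punycode digits that encode non-ASCII code points.
without_delimiter = decode_payload("ASCII")
print(all(ord(ch) >= 0x80 for ch in without_delimiter))  # True

# Encoding a pure-ASCII string yields the trailing-hyphen form, which is
# why `xn--ASCII-` is just an alternate spelling of the plain label.
print("ASCII".encode("punycode"))  # b'ASCII-'
```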

I don't think that `UTS #46` is missing anything for those.
https://www.unicode.org/reports/tr46/#ProcessingStepConvertValidate

Such a label might be ill-formed Punycode, in which case the spec says to just record an error for that label. If it is well-formed, the decoded string is then subjected to validation, which in turn might record an error if there is a disallowed character or something else wrong.
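
Roughly, that two-stage flow could be sketched as follows. This is only an illustration under loud assumptions: `check_xn_label` is a hypothetical helper, Python's `punycode` codec stands in for the decoder, and the "validation" is a crude stand-in (rejecting control and unassigned characters), not the actual UTS #46 validity criteria.

```python
import unicodedata


def check_xn_label(label: str) -> list[str]:
    """Return recorded errors for one `xn--` label (simplified sketch)."""
    errors = []
    payload = label[4:]  # assumes the label starts with "xn--"
    try:
        decoded = payload.encode("ascii").decode("punycode")
    except UnicodeError:
        # Ill-formed Punycode: record an error for this label and stop.
        errors.append("ill-formed punycode")
        return errors
    # Well-formed: validate the decoded string. Stand-in check only;
    # the real criteria in UTS #46 are considerably more detailed.
    if any(unicodedata.category(ch) in ("Cc", "Cn") for ch in decoded):
        errors.append("disallowed character")
    return errors


print(check_xn_label("xn--9"))          # truncated Punycode digit sequence
print(check_xn_label("xn--ASCII"))      # decodes, but to control characters
print(check_xn_label("xn--bcher-kva"))  # decodes to "bücher"
```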

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/url/issues/438#issuecomment-702375523

Received on Thursday, 1 October 2020 20:21:39 UTC