Re: [whatwg/url] Unicode normalization could change the structure of a URL (#626)

Any Unicode normalization that changes the scalar values of a URL string might change its meaning. A robust setup would eagerly parse the URL string and serialize the resulting URL record before doing any normalization. At that point normalization would be a no-op, as a serialized URL is pure ASCII.
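
To make the ordering concrete, here is a rough sketch using the `URL` class and `String.prototype.normalize` as exposed by browsers and Node.js. The helper name `parseThenSerialize`, the `example.com` URL, and the choice of "=" followed by U+0338 (which NFC composes into U+2260, "≠") are my own illustrative assumptions, not something taken from the issue.

```typescript
// Hypothetical helper: parse eagerly, then serialize the URL record.
function parseThenSerialize(input: string): string {
  return new URL(input).href;
}

// "=" followed by U+0338 COMBINING LONG SOLIDUS OVERLAY (assumed example input).
const raw = "https://example.com/?a=\u0338b";

// Normalizing first: NFC composes "=" + U+0338 into U+2260, so the "="
// that separated the query name from its value is gone before parsing.
const normalizedFirst = raw.normalize("NFC");
const lostParam = new URL(normalizedFirst).searchParams.get("a"); // null

// Parsing first: U+0338 is percent-encoded as UTF-8 (%CC%B8) and "=" survives.
const serialized = parseThenSerialize(raw);
const keptParam = new URL(serialized).searchParams.get("a"); // "\u0338b"

// The serialized URL is pure ASCII, so NFC leaves it untouched.
const stable = serialized.normalize("NFC") === serialized; // true
```

The two orderings produce different query parameters from the same input, which is the structural change at issue; only the parse-first ordering preserves the `a` parameter.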

I don't think this conflicts with the recommendation from the i18n WG, but there are some tricky nuances. (As in, if you have a URL string where applying NFC would change both the string and its meaning, you had probably better not share it in that form.)

-- 
View this comment on GitHub:
https://github.com/whatwg/url/issues/626#issuecomment-892852120

Received on Wednesday, 4 August 2021 17:50:14 UTC