- From: klensin via GitHub <sysbot+gh@w3.org>
- Date: Fri, 30 Sep 2022 00:59:35 +0000
- To: public-i18n-archive@w3.org
It may take me a bit of time to completely review the spec (reasons may have been clear from my reaction on the call today), but I immediately noticed one thing. Part of the text reads: "... encoding non-ASCII characters as %HH escapes..." In reality, %HH escapes died (or became ambiguous) the moment non-ASCII characters on the web stopped being assumed to be coded in ISO 8859-1, where a character was still a single octet. The description of how things are escaped should be folded together with the next sentence, on the assumption that the escapes represent the sequence of octets in a UTF-8 string.

The comment about email local parts is definitely not right; I'll try to rewrite it in the next few days, but you may not like the results. In particular, the only "scheme" that, AFAIK, is relevant for email addresses is "mailto:". It is defined in RFC 6068, which predates the standards-track SMTPUTF8 specs and uses the local-part and domain definitions from RFC 5322. Consequently, as far as "mailto:" is concerned, there is no such thing as non-ASCII characters in the local-part, and any IDNs in the domain part have to be (given other specs) transcoded to ASCII using Punycode.

It may go downhill from there: see the fairly extended rant at <https://mailarchive.ietf.org/arch/msg/regext/fiYY81Y7ldmLuWnsHtiZFumx-8M>, supplemented by more ranting at <https://mailarchive.ietf.org/arch/msg/regext/O0TeWi-FP1rouwLtgA2Cqhbtk00>. Later in the thread there are responses from Martin and Asmus that disagree with me, to which I have not yet responded.

--
GitHub Notification of comment by klensin
Please view or discuss this issue at https://github.com/w3c/bp-i18n-specdev/pull/80#issuecomment-1262984637 using your GitHub account

--
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config
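
As an illustration of the two technical points in the comment above (not part of the original message), here is a minimal Python sketch. It shows that the %HH form of a non-ASCII character depends on which octets are being escaped, and that an IDN in a domain part can be transcoded to an ASCII label. The helper names and the sample strings are hypothetical. Python's built-in `idna` codec implements the older IDNA 2003 rules, so it is used here only as a stand-in for whatever conversion the relevant specs actually require.

```python
from urllib.parse import quote, unquote


def percent_encode(text: str, encoding: str) -> str:
    """Escape every octet of `text`, as encoded in `encoding`, as %HH.

    Hypothetical helper: safe="" means no reserved character is left unescaped.
    """
    return quote(text, safe="", encoding=encoding)


def domain_to_ascii(domain: str) -> str:
    """Transcode a domain name to its ASCII (xn--) form.

    Assumption: uses Python's built-in "idna" codec (IDNA 2003),
    purely for illustration.
    """
    return domain.encode("idna").decode("ascii")


# The same character yields different escapes depending on the assumed encoding.
utf8_escape = percent_encode("é", "utf-8")      # two octets in UTF-8
latin1_escape = percent_encode("é", "latin-1")  # one octet in ISO 8859-1

# A reader that assumes UTF-8 cannot recover the character from the
# ISO 8859-1 escape; it gets U+FFFD (the replacement character) instead.
misread = unquote(latin1_escape, encoding="utf-8")

ascii_domain = domain_to_ascii("bücher.example")

print(utf8_escape, latin1_escape, repr(misread), ascii_domain)
```

The round trip only works when both sides agree that the escaped octets are UTF-8, which is the reason for describing escaping and the UTF-8 assumption together.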
Received on Friday, 30 September 2022 00:59:37 UTC