Re: [webauthn] Mechanism for encoding *direction* metadata may need more work (#1644)

In an [email reply](https://lists.w3.org/Archives/Member/member-i18n-core/2021Nov/0023.html) to @wseltzer I remarked:

When we last spoke, I took an action item to produce a PR with suggested text. I prepared that PR and you can find the text of it in my fork of [webauthn](https://aphillips.github.io/webauthn/#sctn-strings-langdir). The only proposed requirement that my design doesn't address is a terminating character or sequence to indicate whether truncation has occurred. The serialization suggested was taken from [JSON-LD](https://www.w3.org/TR/json-ld/#the-i18n-namespace) (in an attempt to avoid a proliferation of serialization schemes in the world).

I brought this proposal to the I18N WG and the feeling of the working group was that we shouldn't be proposing novel methods of encoding language and direction--that instead we should provide guidelines and then let the working group address that with text. The WG feels that separate metadata fields are, of course, preferable, but we understand why that's probably not possible.

In response to @agl's comments above, I could see using an RLM or LRM character instead of the ASCII encoding I propose, as it would serve as both a direction indicator and a truncation marker. These two characters are just strongly-directional invisible characters and so don't bring any display tampering risks. On the other hand, they would have to be mapped to a direction value, rather than applied directly to fields (which often expect the ASCII sequence).

The reason I suggest using ASCII sequences for separators and language tags is that, due to UTF-8's encoding characteristics, they are the most compact representation. We've previously discussed why postfixing the values is better for constrained storage devices. Any implementing system would have to understand the format and remove the additions (the fallback for older systems of course is to display the sequences as "garbage"). Alternative separators to my proposal of `^^^` would be acceptable. Some 3-byte Unicode code points might be good for this, notably U+FFFC (object replacement character) or perhaps the BOM (U+FEFF).

Ultimately, encoding metadata inside your strings is less satisfying than encoding it in metadata fields or in a data structure meant for the purpose. However, retrofitting such a structure onto an existing spec is hard.

-- 
GitHub Notification of comment by aphillips
Please view or discuss this issue at https://github.com/w3c/webauthn/issues/1644#issuecomment-966654233 using your GitHub account



Received on Thursday, 11 November 2021 21:57:07 UTC