Re: [i18n-activity] Trailing position of metadata (#1398)

I think it is reasonable to separate language and directional metadata here. 

Language tags can be quite long, and even the shortest language tag, when encoded using the Unicode language tag characters, requires 16 bytes in UTF-8 (start tag, two tag characters for the alpha2 primary language, cancel tag: four characters at 4 bytes each). Since the tag characters probably should be removed before displaying or processing the string, a trailing position might be cleaner (it's easier to truncate a string than to take a substring from the front). 
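To make the byte count concrete, here is a minimal Python sketch. It assumes the layout described above (U+E0001 LANGUAGE TAG, then the tag spelled in tag characters, then U+E007F CANCEL TAG); the helper names `encode_language_tag` and `strip_tag_characters` are made up for illustration and are not taken from any spec.

```python
LANGUAGE_TAG = "\U000E0001"  # U+E0001 LANGUAGE TAG
CANCEL_TAG = "\U000E007F"    # U+E007F CANCEL TAG
TAG_OFFSET = 0xE0000         # tag characters mirror ASCII at this offset


def encode_language_tag(lang: str) -> str:
    """Spell an ASCII language tag using Unicode tag characters (hypothetical helper)."""
    body = "".join(chr(TAG_OFFSET + ord(c)) for c in lang)
    return LANGUAGE_TAG + body + CANCEL_TAG


def strip_tag_characters(text: str) -> str:
    """Drop every code point in the tag block U+E0000..U+E007F."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


suffix = encode_language_tag("en")
print(len(suffix.encode("utf-8")))  # 16

# With a trailing tag, removing the metadata leaves the original string.
tagged = "hello" + suffix
print(strip_tag_characters(tagged))  # hello
```

Every tag character lies above U+FFFF, so each one costs 4 bytes in UTF-8 regardless of the language tag's content.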

Either way, adding tag characters produces problems for string concatenation and other string operations. And naive implementations that don't process the field can display tofu or garbage as if it were part of the data. Overall, using in-string metadata is a bad idea.

When talking about direction, I think it is helpful to separate bidi controls from metadata. Including LRM/RLM or an LRI/RLI/FSI + PDI enclosing sequence is, to my mind, "altering the contents" of the string to help it display correctly. Processes such as truncation (particularly with the paired controls!) or additional attempts to produce a display-ready sequence alter the meaning and display of the content. These arguments are not new: we talk exhaustively about this in String-Meta as reasons why *not* to use this as a way of communicating direction.
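The truncation problem with paired controls can be sketched in a few lines of Python. This is only an illustration: the sample text and the helper `unclosed_isolates` are assumptions, and plain code point slicing stands in for whatever truncation a real process would apply.

```python
LRI, RLI, FSI, PDI = "\u2066", "\u2067", "\u2068", "\u2069"


def unclosed_isolates(text: str) -> int:
    """Count isolate initiators (LRI/RLI/FSI) that have no matching PDI."""
    depth = 0
    for ch in text:
        if ch in (LRI, RLI, FSI):
            depth += 1
        elif ch == PDI and depth > 0:
            depth -= 1
    return depth


wrapped = RLI + "שלום עולם" + PDI   # sample text, wrapped for display
truncated = wrapped[:6]             # naive truncation drops the trailing PDI

print(unclosed_isolates(wrapped))    # 0
print(unclosed_isolates(truncated))  # 1
```

The truncated value still starts with RLI but has lost its PDI, so the stored data no longer matches what the producer intended to display.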

To me, bidi metadata should instead be explicit, which includes not using invisible controls to convey the value. A field like `direction` with values such as `ltr` and `rtl` is a better choice by far. 
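As a rough sketch of what explicit metadata could look like, the Python below keeps the text and its metadata in separate fields. Only the `direction` field and its `ltr`/`rtl` values come from the suggestion above; the class name `LocalizedString` and the `value` and `lang` field names are assumptions chosen for the example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class LocalizedString:
    value: str       # the text itself, with no embedded controls
    lang: str        # language tag, carried out of band (assumed field name)
    direction: str   # "ltr" or "rtl"

    def truncated(self, max_chars: int) -> "LocalizedString":
        """Shorten the text; the metadata is untouched by construction."""
        return LocalizedString(self.value[:max_chars], self.lang, self.direction)


title = LocalizedString("HTML و CSS: تصميم و إنشاء مواقع الويب", "ar-EG", "rtl")
short = title.truncated(10)

print(json.dumps(asdict(short), ensure_ascii=False))
```

Because the direction lives beside the string rather than inside it, operations such as truncation or concatenation cannot damage it.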

Overall, it would have been better if, given that WebAuthn could not/would not introduce additional fields, they had adopted a serialization scheme using ASCII characters that was unambiguous and machine readable. I note that the [RDF solution found in JSON-LD](https://www.w3.org/TR/json-ld/#the-i18n-namespace) does this pretty well. Amusingly, the example given there uses 16 bytes to encode an average-sized language tag *and* the direction:

> "HTML و CSS: تصميم و إنشاء مواقع الويب"^^i18n:ar-eg_rtl

... but I'd still tend to say that failing to address our comment at all and coming back in v3 to introduce true metadata would have been the better option.
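For illustration, the ASCII suffix in the quoted example can be split with ordinary string operations. The function name `parse_i18n_suffix` is hypothetical, and the sketch assumes the suffix has already been separated from the quoted literal; it relies on the fact that language tags use hyphens, so the underscore unambiguously introduces the direction.

```python
def parse_i18n_suffix(suffix: str):
    """Split '^^i18n:<lang>_<dir>' into (lang, direction); either part may be absent."""
    prefix = "^^i18n:"
    if not suffix.startswith(prefix):
        raise ValueError(f"not an i18n datatype suffix: {suffix!r}")
    lang, sep, direction = suffix[len(prefix):].partition("_")
    if sep and direction not in ("ltr", "rtl"):
        raise ValueError(f"unknown direction: {direction!r}")
    return (lang or None, direction if sep else None)


suffix = "^^i18n:ar-eg_rtl"
print(parse_i18n_suffix(suffix))    # ('ar-eg', 'rtl')
print(len(suffix.encode("utf-8")))  # 16
```

The whole suffix is plain ASCII, so a consumer that does not understand it sees readable text rather than invisible characters.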

-- 
GitHub Notification of comment by aphillips
Please view or discuss this issue at https://github.com/w3c/i18n-activity/issues/1398#issuecomment-875675827 using your GitHub account



Received on Wednesday, 7 July 2021 15:02:57 UTC