Re: agenda+ String-Meta authoring guidance

Here's what i would like to contribute as a comment in the thread.

IMO, the string fields don't need to be individually marked with 
direction in either of the following two cases:

1. resource-wide metadata has been set (eg. at the top of the file) 
indicating the default direction for all strings AND that direction 
applies to the individual string's content. (If it doesn't, eg. if the 
resource-wide direction is set to RTL but a particular string has an 
overall direction of LTR, then that string needs to be labelled.)
2. no resource-wide metadata is set AND the specification requires that 
a receiving application must apply first-strong analysis to a string AND 
the first strong character in the string has the string's intended 
direction.

If the specification doesn't require the use of first-strong analysis by 
the receiving application, and there is no resource-wide default to fall 
back on, then each string would need to be individually labelled.
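
To make the rule above concrete, here is a rough Python sketch of the 
decision. It is only an illustration, not text proposed for any spec. 
The function and parameter names are invented for this example. The 
first-strong check is simplified: it looks only at the Unicode bidi 
class of each character and does not skip isolated runs the way the 
full bidi algorithm does. A string with no strong character at all is 
treated conservatively here as needing a label.

```python
import unicodedata

# Bidi classes treated as "strong" in this simplified check.
_STRONG_LTR = {"L"}
_STRONG_RTL = {"R", "AL"}


def first_strong_direction(text):
    """Return 'ltr' or 'rtl' for the first strong character, or None
    if the text contains no strong character at all."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in _STRONG_LTR:
            return "ltr"
        if bidi_class in _STRONG_RTL:
            return "rtl"
    return None


def needs_direction_label(text, intended, resource_default=None,
                          spec_requires_first_strong=False):
    """Decide whether a string needs its own direction metadata.

    intended: the direction the author wants, 'ltr' or 'rtl'.
    resource_default: resource-wide default direction, or None if unset.
    spec_requires_first_strong: True if the spec obliges receiving
        applications to apply first-strong analysis.
    """
    # Case 1: a resource-wide default exists; label only the exceptions.
    if resource_default is not None:
        return intended != resource_default
    # Case 2: no default, but first-strong analysis is guaranteed;
    # label only the strings where that analysis gives the wrong answer
    # (or, conservatively, finds no strong character).
    if spec_requires_first_strong:
        return first_strong_direction(text) != intended
    # Otherwise every string has to be labelled individually.
    return True
```

For instance, an RTL string that begins with a Latin acronym, such as 
"HTML" followed by Arabic text, would still need a label under case 2, 
because its first strong character is LTR.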

@TallTed i think your edit should say that the direction is not required 
for this particular case because... (presumably the application is 
expected to do first-strong analysis and get it right).  It's not that 
these are just optional in all cases. Or better still, you could include 
some strings that are ambiguous (ie. don't resolve correctly via 
first-strong analysis) – the i18n WG may be able to provide you with 

Also, you moved a period to the left of some Arabic text.  This is one 
of the problems with trying to create RTL examples. It now looks as if 
the RTL base direction has been applied to the text for that string in 
the example, although in fact it hasn't - which will become clear if 
someone tries to copy the Arabic text in that example to use someplace 
else (the period will end up in the wrong place if direction is applied 
to the string, either explicitly or by the surrounding context).  
There's no easy answer here (see 
Personally, i'd be inclined to leave the period where it was, and 
perhaps indicate that the Arabic examples show the in-memory order. 
(Fwiw, if you had a string containing 
<arabic-text><latin-text><arabic-text> this problem would be 
significantly worse, because the directional runs would appear out of 
order, making the text difficult to read.)
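
As a small illustration of what "in-memory order" means here, the 
Python sketch below uses a made-up sample string (an Arabic word 
followed by a period) and invented helper names. It just shows that the 
period is stored last, wherever a renderer ends up drawing it, and how 
the string might be wrapped in isolate controls when an RTL base 
direction is applied.

```python
RLI = "\u2067"  # U+2067 RIGHT-TO-LEFT ISOLATE
PDI = "\u2069"  # U+2069 POP DIRECTIONAL ISOLATE

# Sample data for this sketch: an Arabic word followed by a period,
# written with escapes so the storage order is unambiguous.
arabic_sentence = "\u0645\u0631\u062D\u0628\u0627."


def code_points(text):
    """List the code points of text in storage (logical) order."""
    return [f"U+{ord(ch):04X}" for ch in text]


def isolate_rtl(text):
    """Wrap text so a bidi-aware renderer applies an RTL base direction
    to it, independently of the surrounding context."""
    return f"{RLI}{text}{PDI}"


# The period is the final code point in memory, even though an RTL
# rendering would draw it at the left-hand end of the line.
print(code_points(arabic_sentence))
```

If the period is instead typed before the Arabic letters so that it 
looks right in an LTR example, it becomes the first code point in 
memory, and it will jump to the wrong side once an RTL direction is 
applied to the copied text.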


PS: We should probably be having this discussion in the i18n-activity 
tracker issue.

> Addison Phillips <>
> 25 February 2024 at 17:10
> I made this comment: 
> In this issue, the WG wants to remove RTL direction fields from purely 
> RTL Arabic strings “because UTF-8 will get it right”. In responding to 
> this I said:
>   * I disagree. When the Arabic string is inserted in another context,
>     it wants to be bidi isolated. Doing that means setting the base
>     direction for the string. If there is no base direction stored in
>     the record, the consumer has to figure out (by inspecting the
>     string or from the language tag) which string direction to use--or
>     depend on |auto|, if that is an option in the target context.
>     |auto| is an option in HTML, but many UI APIs (Windows, MacOS,
>     Java, etc.) require a specific direction.
> LTR is the default for most applications and most languages, so 
> omitting the |@direction| from left-to-right |en| and |fr| texts isn't 
> a serious disadvantage to those strings. But it's a good idea to 
> always transmit RTL text with an RTL direction. It is the case that 
> purely Arabic strings will work appropriately with |auto|, so I don't 
> disagree with (person)'s observation. But most applications don't have 
> humans evaluating each string for whether the direction is needed or not.
> I wanted to point to authoring guidance in string-meta, but that’s 
> lacking. Let’s discuss whether (a) my response was the appropriate one 
> and (b) either way, how to record durably our recommendations.
> ~Addison
> Addison Phillips
> Chair (W3C Internationalization WG)
> Internationalization is not a feature.
> It is an architecture.

Received on Wednesday, 28 February 2024 13:23:01 UTC