Re: [csswg-drafts] [css-text] Should zero width space break Arabic shaping? (#3861)

cc @roozbehp again, since I think he can articulate the reasoning behind this best.

> Does it make sense to add a word-break without affecting shaping? That sounds counter intuitive to me.

My understanding is that ZWSP is inherently a line-break control character, so it's misleadingly named at best.  It's closer to SOFT HYPHEN than to a space.  The difference from SOFT HYPHEN seems to be that ZWSP is not expected to turn into a visible hyphen if a line break does happen there.  I think the original use case was scripts that don't use inter-word spaces, where it marks line-break opportunities.  It feels to me that ZWSP and SOFT HYPHEN should have been one character, meaning "line break allowed".

Another use of ZWSP is to mark break opportunities / word boundaries in concatenated words, like long URLs or hashtags such as "justanotherawesomelylongurl".  The idea is that whether or not you mark a location as a line-break opportunity should be *separate* from whether Arabic shaping happens.  So, if Arabic shaping is not desired, one should use a ZWNJ to control that, separately from ZWSP.
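
To make the URL / hashtag case concrete, here is a rough sketch in Python.  The helper names `mark_breaks` and `break_opportunities` are made up for illustration, and the word list is supplied by hand; this is not how any browser or shaping engine implements it.

```python
import unicodedata

ZWSP = "\u200b"  # ZERO WIDTH SPACE: marks a line-break opportunity
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER: the separate control for cursive joining


def mark_breaks(words):
    """Hypothetical helper: join words with ZWSP so a renderer may wrap between them."""
    return ZWSP.join(words)


def break_opportunities(text):
    """Hypothetical helper: return the pieces of text between ZWSP marks."""
    return text.split(ZWSP)


words = ["just", "another", "awesomely", "long", "url"]
marked = mark_breaks(words)

# What the reader sees is unchanged; only break opportunities were added.
visible = marked.replace(ZWSP, "")

# The two controls are distinct characters with distinct jobs.
names = (unicodedata.name(ZWSP), unicodedata.name(ZWNJ))
```

Here `visible` is still the original concatenated string, while `break_opportunities(marked)` recovers the five pieces a line breaker could wrap between.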

Another explanation is that many characters, like ZWSP, affect only one aspect of Unicode processing and are ignored by all other processes.  This is done to manage complexity: instead of having to specify the behavior of each control / format character for every process on every script, it can be specified independently of scripts for the most part.
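
A toy model of that "one character, one process" idea, in Python.  Both functions are deliberately simplistic stand-ins I made up for this comment (real line breaking and Arabic shaping are far more involved); the only point is that each process looks at its own control character and treats the other as if it were not there.

```python
ZWSP = "\u200b"  # consulted only by the toy line breaker
ZWNJ = "\u200c"  # consulted only by the toy shaper
BEH = "\u0628"   # ARABIC LETTER BEH, a dual-joining letter


def line_break_segments(text):
    """Toy line breaker: only ZWSP creates a break opportunity.

    ZWNJ is not interpreted here; it simply stays inside its segment.
    """
    return text.split(ZWSP)


def joining_runs(text):
    """Toy shaper: only ZWNJ interrupts cursive joining.

    ZWSP is skipped as if it were absent, so letters on either side
    still land in the same joining run.
    """
    return text.replace(ZWSP, "").split(ZWNJ)


with_zwsp = BEH + ZWSP + BEH          # break allowed, joining kept
with_zwnj = BEH + ZWNJ + BEH          # no break, joining interrupted
with_both = BEH + ZWNJ + ZWSP + BEH   # break allowed and joining interrupted
```

Under this model, `with_zwsp` yields two line-break segments but a single joining run, `with_zwnj` yields the opposite, and getting both effects requires both characters.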

Anyway.  Just my understanding.  As I said, I was also surprised by this, but I understand what the rationale / thinking has been.

-- 
GitHub Notification of comment by behdad
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/3861#issuecomment-485791090 using your GitHub account

Received on Tuesday, 23 April 2019 12:56:34 UTC