[i18n-discuss] Use of ZWJ

From: r12a via GitHub <sysbot+gh@w3.org>
Date: Thu, 10 Nov 2016 17:12:42 +0000
To: www-international@w3.org
Message-ID: <issues.opened-188566286-1478797960-sysbot+gh@w3.org>
r12a has just created a new issue for 
https://github.com/w3c/i18n-discuss:

== Use of ZWJ ==
[Comment by @ntounsi]

What is the specification of the zero-width joiner (ZWJ)?
1) The Unicode Standard 9.0, Chapter 9, Section 9.2 (Arabic), p. 371 
(http://www.unicode.org/versions/Unicode9.0.0/ch09.pdf) says:
"... The use of a joiner **adjacent to a suitable** letter permits 
that letter to form a cursive connection without a **visible 
neighbor**." and gives a use case "This provides a simple way to 
encode some special cases, such as exhibiting a connecting form in 
isolation." 

I guess that "suitable letter" means a letter that can join according 
to the rules of its writing system (e.g. Aleph doesn't join to its 
left in Arabic).
But what does "adjacent" mean? To the right, to the left, or both?
"Without a visible neighbor": does this mean there is no visible 
neighbor at all next to the letter? What about a visible neighbor 
that is separated by, say, a space? 
To support the definition, Unicode gives an example of a special case 
with the letter HEH and shows what ZWJ means for that special case. 
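
To make the "adjacent" question concrete, here is a minimal Python 
sketch of the code point sequences that, as far as I understand the 
Unicode text, request each shape of a dual-joining letter such as HEH: 
ZWJ after the letter asks for a connection on the following (left) 
side, ZWJ before it asks for a connection on the preceding (right) 
side. The helper name zwj_forms is made up for illustration, and 
whether a given browser and font actually render these shapes is 
exactly what is in question here.

```python
import unicodedata

ZWJ = "\u200D"  # ZERO WIDTH JOINER
HEH = "\u0647"  # ARABIC LETTER HEH (dual-joining)


def zwj_forms(letter):
    """Return the sequences that should request each contextual shape.

    Hypothetical helper: it only builds strings, it does no shaping.
    """
    return {
        "isolated": letter,              # no joiner at all
        "initial": letter + ZWJ,         # joins to the following letter only
        "medial": ZWJ + letter + ZWJ,    # joins on both sides
        "final": ZWJ + letter,           # joins to the preceding letter only
    }


forms = zwj_forms(HEH)
for name, text in forms.items():
    codepoints = " ".join(f"U+{ord(ch):04X}" for ch in text)
    print(f"{name:9} {codepoints}")
```

This covers only the "no visible neighbor" case. It says nothing about 
what should happen when a visible letter follows after a space.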

2) Wikipedia's informal definition says: 
 "When placed **between** two characters that **would** otherwise not 
be connected, a ZWJ causes them to be printed in their connected 
forms."
This is more explicit, but it covers (only?) the case "between" two 
letters. And when "would" they not be connected? What are those 
situations?

Anyway, browser implementations of ZWJ differ from one another. I 
noticed that the display in browsers (Gecko-based vs. WebKit-based) 
depends on:
- the font used (some fonts seem to react better), 
- which letter is the neighbour, even if separated by a space, 
- the base direction, rtl or ltr (!?)

Also, the letter Ghain behaves differently from the letter Heh, though 
both have four different shapes: initial, medial, final and isolated. 
Moreover, the Gecko implementation is right for "colored letters" 
within a word (they join). But ZWJ applied to only one letter works 
better in the WebKit implementation: all four shapes show up.
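
For anyone who wants to reproduce the comparison, the following Python 
sketch generates a small HTML test page that crosses the two letters, 
the four ZWJ patterns and the two base directions mentioned above. It 
is only an assumed test layout, not the page used for the observations 
in this message. The names LETTERS, PATTERNS and build_rows are 
illustrative, and the font is left to whatever the browser picks.

```python
from html import escape
from itertools import product

ZWJ = "\u200D"  # ZERO WIDTH JOINER
LETTERS = {"HEH": "\u0647", "GHAIN": "\u063A"}
# {c} is the letter, {z} is the joiner
PATTERNS = {
    "isolated": "{c}",
    "initial": "{c}{z}",
    "medial": "{z}{c}{z}",
    "final": "{z}{c}",
}
DIRECTIONS = ("rtl", "ltr")


def build_rows():
    """Build one <p> per letter / pattern / base direction combination."""
    rows = []
    for (lname, ch), (form, pat), direction in product(
        LETTERS.items(), PATTERNS.items(), DIRECTIONS
    ):
        text = pat.format(c=ch, z=ZWJ)
        rows.append(
            f'<p dir="{direction}">{lname} {form} ({direction}): '
            f'<span lang="ar">{escape(text)}</span></p>'
        )
    return rows


page = (
    '<!DOCTYPE html>\n<meta charset="utf-8">\n<title>ZWJ test</title>\n'
    + "\n".join(build_rows())
)

# Write the page out so it can be opened in each browser under test.
with open("zwj-test.html", "w", encoding="utf-8") as fh:
    fh.write(page)
```

Opening zwj-test.html in a Gecko-based and a WebKit-based browser side 
by side should show whether the differences described above depend on 
the letter, the pattern or the base direction.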

See https://github.com/w3c/i18n-discuss/issues/2
Please do NOT reply to this email. If you'd like to contribute to the 
discussion, please do so at the above link. You will need to subscribe
 yourself to the issue (using the button provided by that page) to 
receive notifications of further comments.
Received on Thursday, 10 November 2016 17:12:48 UTC
