Re: [alreq] Missing terms in glossary

From: r12a via GitHub <sysbot+gh@w3.org>
Date: Tue, 18 Jul 2017 11:46:31 +0000
To: public-i18n-archive@w3.org
Message-ID: <issue_comment.created-316039590-1500378389-sysbot+gh@w3.org>
@ntounsi here are some suggested alterations and a couple of additions. I think it's useful to mention which marks are and are not combining characters in Unicode (particularly in case anyone with an IDN background is reading this).

**Ijam** : Diacritical marks applied to a basic letter shape (or skeleton) to derive a new letter. For example, a dot placed under a "curve" produces the letter Bah. In Unicode, each letter plus ijam combination is encoded as a separate, atomic character.

**Tashkil** : Marks that are added to letters to indicate vocalisation of text or to correct pronunciation. In Unicode these are all combining characters applied to a base character.

**Harakat** : Tashkil marks representing short vowel sounds.

**Tanwin** : (Derived from Noon.) Tashkil marks indicating postnasalized or long vowels at the end of a word, written by doubling the sign of one of the harakat diacritics.

**Shadda** : A tashkil mark indicating gemination of the base consonant.

**Sukun** : A tashkil mark indicating the lack of a vowel after the consonant to which it is attached.
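
The Unicode distinction drawn in the definitions above (ijam letters are atomic, tashkil are combining marks) can be checked with a short Python sketch using the standard `unicodedata` module. The helper names `describe` and `is_combining_mark` and the choice of sample characters are illustrative assumptions, not part of the proposed glossary text.

```python
import unicodedata

# Sample characters, chosen for illustration only.
BEH = "\u0628"  # ARABIC LETTER BEH: skeleton plus ijam, one code point
TASHKIL = {
    "fatha": "\u064E",     # a harakat mark (short vowel)
    "fathatan": "\u064B",  # a tanwin mark (doubled fatha sign)
    "shadda": "\u0651",    # gemination
    "sukun": "\u0652",     # no following vowel
}


def describe(ch):
    """Return the Unicode properties relevant to the atomic/combining question."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "combining_class": unicodedata.combining(ch),
        "decomposition": unicodedata.decomposition(ch),
    }


def is_combining_mark(ch):
    """True if the character's general category is a Mark (Mn, Mc or Me)."""
    return unicodedata.category(ch).startswith("M")


if __name__ == "__main__":
    # The letter has no decomposition: the dot is not a separate character.
    print(describe(BEH))
    # Each tashkil mark is a nonspacing mark with a non-zero combining class.
    for label, ch in TASHKIL.items():
        print(label, describe(ch))
    # A letter followed by a tashkil mark stays as two code points under NFC.
    print(len(unicodedata.normalize("NFC", BEH + TASHKIL["fatha"])))
```

An empty `decomposition` and a combining class of 0 for the letter, against a Mark category and non-zero combining class for each tashkil character, is what "atomic" versus "combining" means in practice here.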

wdyt?


-- 
GitHub Notification of comment by r12a
Please view or discuss this issue at https://github.com/w3c/alreq/issues/130#issuecomment-316039590 using your GitHub account
Received on Tuesday, 18 July 2017 11:46:39 UTC