- From: asmusf via GitHub <sysbot+gh@w3.org>
- Date: Thu, 04 Feb 2016 18:06:12 +0000
- To: public-i18n-archive@w3.org
I respectfully disagree with those scholars, and beating up people is not to be encouraged.

For one, in terms of digital text representation, the various positional forms of Arabic (or Mongolian) characters are simply different glyphs; they are selected by the layout engine and are not encoded separately as characters. (I am leaving aside the compatibility characters for Arabic, which correspond to an earlier attempt and exist as an aid for emulators and other types of code museums.)

There is a similarity: in each case, around the concept of a "letter" there is a set of shapes that the letter can take on. But "casing" represents only a subset of such cases: a bi-cameral script, as the name says, has two sets of forms for each letter, and the choice of form is a matter not of typography but of orthography, with conventions for when to use each one that are based on the content of the text and the intent of the author. In contrast, the positional forms in cursively connected (and similar) scripts are determined solely (or primarily) by the nature of the adjacent letters.

Also, the description in section 2.1 conforms to the definition of casing found elsewhere, e.g. in the Unicode Standard, and there is little to be gained by suddenly pretending that the term encompasses scripts that are not bi-cameral (but nevertheless have multiple shapes for the same letters).

Finally, case folding requires that there be multiple code points for the same letter and that ignoring the distinction between them be a common process. (Hiragana and Katakana are an example of two sets of shapes for the same sound values that are not customarily folded, even though all users know which two forms make up the set for a given sound.)

-- 
GitHub Notification of comment by asmusf

Please view or discuss this issue at https://github.com/w3c/charmod-norm/issues/67#issuecomment-179972047 using your GitHub account
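
As a rough illustration of the three cases distinguished in the comment above, here is a minimal Python sketch. It is not part of the original comment. It assumes that Python's built-in `str.casefold()` and `unicodedata.normalize()` are acceptable stand-ins for Unicode case folding and compatibility normalization, and the variable names are invented for the example.

```python
import unicodedata

# Bi-cameral script: the two case forms of a letter are separate code
# points, and case folding maps them to a common form.
upper_a = "\u0041"  # LATIN CAPITAL LETTER A
lower_a = "\u0061"  # LATIN SMALL LETTER A
case_pair_folds = upper_a.casefold() == lower_a.casefold()

# Hiragana and Katakana: two code points for the same sound value,
# but case folding does not treat them as a case pair.
hiragana_a = "\u3042"  # HIRAGANA LETTER A
katakana_a = "\u30A2"  # KATAKANA LETTER A
kana_pair_folds = hiragana_a.casefold() == katakana_a.casefold()

# Arabic: the letter itself is one code point, and its positional shapes
# are chosen at layout time. The separately encoded "initial form" is a
# compatibility character that compatibility normalization (NFKC) maps
# back to the plain letter; case folding leaves the letter untouched.
beh = "\u0628"          # ARABIC LETTER BEH
beh_initial = "\uFE91"  # ARABIC LETTER BEH INITIAL FORM
beh_unchanged_by_folding = beh.casefold() == beh
beh_initial_normalizes = unicodedata.normalize("NFKC", beh_initial) == beh

print(case_pair_folds, kana_pair_folds,
      beh_unchanged_by_folding, beh_initial_normalizes)
```

In this sketch only the Latin pair is unified by case folding. The kana pair stays distinct, and the Arabic presentation form is handled by normalization, a different mechanism from case folding.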
Received on Thursday, 4 February 2016 18:06:14 UTC