- From: Asmus Freytag <asmusf@ix.netcom.com>
- Date: Thu, 4 Feb 2016 10:15:52 -0800
- To: www-international@w3.org
On 2/4/2016 1:25 AM, Martin J. Dürst wrote:
> On 2016/02/04 12:16, klensin via GitHub wrote:
>> klensin has just created a new issue for
>> https://github.com/w3c/charmod-norm:
>>
>> == Case Folding introduction (Section 2.1) ==
>> It may not be relevant (or even, by other measures, correct), but I've
>> been beaten up several times by scholars of Arabic calligraphy who
>> have claimed that any treatment of the distinction among initial,
>> medial, final, and isolated forms as different from the distinction
>> between upper, lower (and maybe title) case reflects a European script
>> bias and not actual relationships.
>
> I fully agree with John. I don't have any experience of being beaten
> up by experts on that point, but then only because I never even got
> the idea to make such a point.
>
> Regards, Martin.

I've responded on GitHub as follows:

I respectfully disagree with those scholars, and beating up people is not to be encouraged.

For one, in terms of digital text representation, the various positional forms for Arabic (or Mongolian) characters are simply different glyphs; they are selected by the layout engine, and not encoded separately as characters. (This leaves aside the compatibility characters for Arabic, which correspond to an earlier attempt and exist as an aid for emulators and other types of code museums.)

There is a similarity, in that in each case there is, around the concept of a "letter", a set of shapes that this letter can take on. However, "casing" represents only a subset of such situations: a bi-cameral script, as the name says, has two sets of forms for each letter, and the choice of form is not one of typography but of orthography, with conventions for when to use each one that are based on the content of the text and the intent of the author. In contrast, the positional forms for cursively connected (and similar) scripts are determined solely (or primarily) by the nature of the adjacent letters.

Also, the description in section 2.1 conforms to the definition of casing found elsewhere, e.g. in the Unicode Standard, and there's little to be gained by suddenly pretending that the term encompasses scripts that are not bi-cameral (but nevertheless have multiple shapes for the same letters).

Finally, case folding presupposes both that there are multiple code points for the same letter and that ignoring the distinction between them is a common process. (Hiragana and Katakana are an example of two sets of shapes for the same sound values that are not customarily folded, even though all users know which two forms make up the set for a given sound.)
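To make the distinction concrete, here is a minimal sketch in Python using only the standard `unicodedata` module. It is an illustration, not part of the charmod-norm text: the helper names `compat_base` and `is_case_variant` are made up for this example, and BEH and the kana letter A are arbitrary sample letters. It compares how a cased pair, the Arabic compatibility presentation forms, and a Hiragana/Katakana pair behave under case folding and under compatibility normalization.

```python
import unicodedata

# Arabic BEH is encoded once, as U+0628; a layout engine chooses the
# positional glyph. The compatibility presentation forms below are the
# separately encoded exception mentioned above.
BEH = "\u0628"
BEH_FORMS = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}

# Hiragana and Katakana letter A: two code points for the same sound value.
HIRAGANA_A, KATAKANA_A = "\u3042", "\u30A2"


def compat_base(ch: str) -> str:
    """Hypothetical helper: apply compatibility normalization (NFKC)."""
    return unicodedata.normalize("NFKC", ch)


def is_case_variant(a: str, b: str) -> bool:
    """Hypothetical helper: distinct strings that become equal under case folding."""
    return a != b and a.casefold() == b.casefold()


# A bicameral pair: two code points, unified by case folding.
print("A / a:", is_case_variant("A", "a"))

# Presentation forms: unified by NFKC, but not by case folding.
for position, form in BEH_FORMS.items():
    print(
        position,
        unicodedata.name(form),
        "NFKC gives BEH:", compat_base(form) == BEH,
        "case variant of BEH:", is_case_variant(form, BEH),
    )

# Kana: two sets of shapes for the same sound, not unified by case folding.
print("Hiragana A / Katakana A:", is_case_variant(HIRAGANA_A, KATAKANA_A))
```

Only the Latin pair is reported as a case variant. The presentation forms collapse to U+0628 under NFKC, which treats them as compatibility variants of one character rather than as cases, and the kana pair is left alone by `str.casefold()`.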
Received on Thursday, 4 February 2016 18:16:23 UTC