Re: [charmod-norm] Case Folding introduction (Section 2.1)

I respectfully disagree with those scholars, and beating up people is 
not to be encouraged.

For one, in terms of digital text representation, the various 
positional forms for Arabic (or Mongolian) characters are simply 
different glyphs; they are selected by the layout engine, and not 
encoded separately as characters. (I am leaving aside the compatibility 
characters for Arabic, which correspond to an earlier attempt and exist 
as an aid for emulators and other types of code museums.)
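
To make the point about glyphs versus characters concrete, here is a
minimal sketch in Python using only the standard unicodedata module.
The choice of the letter BEH and the helper name to_base_letter are
mine, purely for illustration. It shows that the four compatibility
presentation forms all carry a compatibility decomposition back to the
single letter that is normally stored in text.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH, the character normally stored in text

# Compatibility code points for the four positional forms of BEH.
PRESENTATION_FORMS = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}


def to_base_letter(ch: str) -> str:
    """Map a compatibility presentation form back to its base letter.

    Hypothetical helper for illustration; it simply applies NFKC.
    """
    return unicodedata.normalize("NFKC", ch)


for position, form in PRESENTATION_FORMS.items():
    # The decomposition tag (<isolated>, <final>, ...) records the
    # positional form; the code point after it is the base letter.
    print(position, f"U+{ord(form):04X}", unicodedata.decomposition(form),
          to_base_letter(form) == BEH)
```

In ordinary text only U+0628 appears, and the layout engine picks the
shape from the neighbouring letters.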

While there is a similarity, in that in each case there is a set of 
shapes that a "letter" can take on, "casing" represents a subset: a 
bi-cameral script, as the name says, has two sets of forms for each 
letter, and the choice of form is not one of typography but of 
orthography, with conventions for when to use each one that are based 
on the content of the text and the intent of the author.

In contrast, the positional forms for cursively connected (and 
similar) scripts are determined solely (or primarily) by the nature 
of the adjacent letters.

Also, the description in section 2.1 conforms to the definition of 
casing found elsewhere, e.g. in the Unicode Standard, and there's 
little to be gained by suddenly pretending that the term encompasses 
scripts that are not bi-cameral (but nevertheless have multiple 
shapes for the same letters).

Finally, case folding requires that there be multiple code points for 
the same letter and that ignoring that distinction is a common 
process. (Hiragana and Katakana are an example of two sets of shapes 
for the same sound values, which are not customarily folded, even 
though all users know which two form the set for the given sound.)
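
As a rough illustration of that last point, the sketch below uses
Python's built-in str.casefold() to show that case folding merges
upper- and lowercase letters but leaves Hiragana and Katakana
distinct. The function kana_fold is a hypothetical helper of my own,
not part of any standard; it assumes the fixed offset of 0x60 between
the Katakana block (U+30A1..U+30F6) and the corresponding Hiragana,
and is there only to show that such a fold could be written, not that
it is customary.

```python
HIRAGANA_A = "\u3042"  # HIRAGANA LETTER A
KATAKANA_A = "\u30A2"  # KATAKANA LETTER A


def kana_fold(text: str) -> str:
    """Hypothetical fold of Katakana onto Hiragana (illustration only)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0x30A1 <= cp <= 0x30F6:
            # Assumption: each of these Katakana letters sits exactly
            # 0x60 above its Hiragana counterpart.
            out.append(chr(cp - 0x60))
        else:
            out.append(ch)
    return "".join(out)


# Case folding ignores the upper/lower distinction...
print("STRASSE".casefold() == "stra\u00DFe".casefold())
# ...but does not treat the two kana sets as equivalent.
print(HIRAGANA_A.casefold() == KATAKANA_A.casefold())
# A kana fold is possible, it just is not what case folding does.
print(kana_fold(KATAKANA_A) == HIRAGANA_A)
```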

-- 
GitHub Notification of comment by asmusf
Please view or discuss this issue at 
https://github.com/w3c/charmod-norm/issues/67#issuecomment-179972047 
using your GitHub account

Received on Thursday, 4 February 2016 18:06:14 UTC