
Re: [charmod-norm] Case Folding introduction (Section 2.1)

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Fri, 12 Feb 2016 16:24:45 +0900
To: Asmus Freytag <asmusf@ix.netcom.com>, <www-international@w3.org>
Message-ID: <56BD88BD.6070402@it.aoyama.ac.jp>

Just to make sure my correction doesn't get missed in the mail archives:

Here's what I just posted on GitHub
(please reply on https://github.com/w3c/charmod-norm/issues/67, not here):

 >>>>>>>>>>>>>
This is what I originally posted (in e-mail):

 >>>>
I fully agree with John. I don't have any experience of being beaten
up by experts on that point, but then only because I never even got
the idea to make such a point.
 >>>>

I'm sorry for the delay in coming back to this issue, but I somehow 
misread John Klensin's comment, and got everything mixed up. As a 
result, I have to disagree with John, and agree with Asmus and Addison 
Phillips. The initial/medial/final/isolate distinction in Arabic is of 
quite a different nature than the casing distinctions in 
Latin/Greek/Cyrillic/... In addition to what has already been said, I'd 
like to mention that the Arabic calligraphy experts I have
interacted with, in particular Tom Milo, always insisted that the
four-way distinction was a hopelessly crude approximation to what good 
typography and calligraphy for Arabic warranted.
 >>>>>>>>>>>>>

Regards,   Martin.

On 2016/02/05 03:15, Asmus Freytag wrote:
> On 2/4/2016 1:25 AM, Martin J. Dürst wrote:
>> On 2016/02/04 12:16, klensin via GitHub wrote:
>>> klensin has just created a new issue for
>>> https://github.com/w3c/charmod-norm:
>>>
>>> == Case Folding introduction (Section 2.1) ==
>>> It may not be relevant (or even, by other measures, correct), but I've
>>>   been beaten up several times by scholars of Arabic calligraphy who
>>> have claimed that any treatment of the distinction among initial,
>>> medial, final, and isolated forms as different from the distinction
>>> between upper, lower (and maybe title) case reflects a European script
>>>   bias and not actual relationships.
>>
>> I fully agree with John. I don't have any experience of being beaten
>> up by experts on that point, but then only because I never even got
>> the idea to make such a point.
>>
>> Regards,   Martin.
>
> I've responded on GitHub as follows:
>
> I respectfully disagree with those scholars, and beating up people is
> not to be encouraged.
>
> For one, in terms of digital text representation, the various positional
> forms for Arabic (or Mongolian) characters are simply different glyphs;
> they are selected by the layout engine, and not encoded separately as
> characters. (Leaving aside the compatibility characters for Arabic that
> correspond to an earlier attempt and exist as an aid for emulators and
> other types of code museums).
>
> While there is a similarity, that in each case, around the concept of a
> "letter" there is a set of shapes that this letter can take on, "casing"
> represents a subset: a bi-cameral script, as the name says, has two
> sets of forms for each letter, and the choice of form is not one of
> typography but of orthography, with conventions when to use each one
> that are based on the content of the text and the intent of the author.
>
> In contrast, the positional forms for cursively connected (and similar)
> scripts are determined solely (or primarily) by the nature of the
> adjacent letters.
>
> Also, the description in section 2.1 conforms to the definition of
> casing found elsewhere, e.g. in the Unicode Standard, and there's little
> to be gained by suddenly pretending that the term encompasses scripts that
> are not bi-cameral (but nevertheless have multiple shapes for the same
> letters).
>
> Finally, case folding requires that there be multiple code points for
> the same letter and that ignoring that distinction is a common process
> (Hiragana and Katakana are an example of two sets of shapes for the same
> sound values, which are not customarily folded, even though all users
> know which two form the set for the given sound).
>>
>>
>
>
> .
>
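
The distinction drawn in the quoted message can be checked against the Unicode character data. What follows is only a rough sketch, using Python's standard unicodedata module; the helper names (same_ignoring_case, compat_tag) and the choice of BEH as the sample letter are assumptions made for illustration and do not come from the thread. It shows that case is encoded as separate code points that case folding merges, that the Arabic positional forms exist only as compatibility characters which normalize back to a single letter, and that Hiragana and Katakana are left distinct by case folding.

```python
import unicodedata


def same_ignoring_case(a: str, b: str) -> bool:
    """Compare two strings after Unicode case folding."""
    return a.casefold() == b.casefold()


def compat_tag(ch: str) -> str:
    """Return the compatibility tag (e.g. '<initial>') of a character, or ''."""
    decomposition = unicodedata.decomposition(ch)
    return decomposition.split()[0] if decomposition.startswith("<") else ""


# Case: upper and lower case are separate code points, and folding merges them.
print(same_ignoring_case("Case", "CASE"))

# Arabic: one encoded letter (BEH, U+0628). The four positional forms exist
# only as compatibility characters in the Arabic Presentation Forms-B block.
BEH = "\u0628"
BEH_FORMS = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}
for name, form in BEH_FORMS.items():
    # NFKC maps each presentation form back to the single base letter.
    folded = unicodedata.normalize("NFKC", form)
    print(name, compat_tag(form), folded == BEH)

# Case folding itself leaves the Arabic letter untouched.
print(BEH.casefold() == BEH)

# Hiragana A (U+3042) and Katakana A (U+30A2): two code points for the same
# sound value, but case folding does not treat them as a case pair.
print(same_ignoring_case("\u3042", "\u30A2"))
```

In ordinary Arabic text only U+0628 would appear; selecting the initial, medial, final, or isolated shape is left to the layout engine, which is why there is nothing for a case-folding step to fold.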
Received on Friday, 12 February 2016 07:25:32 UTC
