RE: comments on Character Model for the World Wide Web: String Matching and Searching from Phillips, Addison on 2014-06-19 (www-international@w3.org from April to June 2014)

From: Phillips, Addison <addison@lab126.com>
Date: Thu, 19 Jun 2014 22:13:34 +0000
To: Matitiahu Allouche <matitiahu.allouche@gmail.com>, "'Asmus Freytag'" <asmusf@ix.netcom.com>, "'Najib Tounsi'" <ntounsi@emi.ac.ma>, "www-international@w3.org" <www-international@w3.org>
Message-ID: <7C0AF84C6D560544A17DDDEB68A9DFB5246FCE62@ex10-mbx-36009.ant.amazon.com>

I actually changed our document to say:

Presentation forms of Arabic (initial, medial, final, isolated)
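
For anyone who wants to check the decompositions behind that label, here is a minimal Python sketch using the standard library's unicodedata module. The helper name compat_tag and the choice of U+FEE5 through U+FEE8 (the four presentation forms of NOON) are illustrative assumptions, not text from our document, and the results depend on the Unicode data bundled with your Python build.

```python
import unicodedata

# Base letter discussed in the thread: ARABIC LETTER NOON.
NOON = "\u0646"

# Assumed for illustration: the four NOON presentation forms
# (isolated, final, initial, medial) in Arabic Presentation Forms-B.
NOON_FORMS = ["\uFEE5", "\uFEE6", "\uFEE7", "\uFEE8"]


def compat_tag(ch):
    """Return the compatibility tag (such as '<final>') for ch, or None.

    unicodedata.decomposition() returns a string like '<final> 0646' for
    compatibility decompositions, and an empty string when the character
    has no decomposition mapping at all.
    """
    decomp = unicodedata.decomposition(ch)
    if decomp.startswith("<"):
        return decomp.split()[0]
    return None


for ch in NOON_FORMS:
    # Show the code point, its name, and its raw decomposition mapping.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {unicodedata.decomposition(ch)}")

# Greek final sigma and Hebrew final kaf are separate characters with no
# compatibility decomposition, so they fall outside what the table shows.
for ch in ["\u03C2", "\u05DA"]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: tag={compat_tag(ch)}")
```

Under NFKC or NFKD each of the four forms should fold to U+0646, while the Greek and Hebrew final letters are left unchanged.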


> -----Original Message-----
> From: Matitiahu Allouche [mailto:matitiahu.allouche@gmail.com]
> Sent: Thursday, June 19, 2014 2:25 PM
> To: Phillips, Addison; 'Asmus Freytag'; 'Najib Tounsi';
> www-international@w3.org
> Subject: RE: comments on Character Model for the World Wide Web: String
> Matching and Searching
> 
> Given the inputs from Najib and Asmus, I withdraw my comment and agree
> that the Arabic shapes are a more appropriate example.  However, I am not
> sure that the title "Cursive forms" is best. I still think that cursiveness is not the
> main point here. Something like "Position-dependent forms" seems better
> IMHO (and UAX#15 is not the ultimate truth).
> --
> Shalom (Regards),  Mati
> 
> 
> -----Original Message-----
> From: Phillips, Addison [mailto:addison@lab126.com]
> Sent: Thursday, June 19, 2014 10:44 PM
> To: Asmus Freytag; Najib Tounsi; Matitiahu Allouche;
> www-international@w3.org
> Subject: RE: comments on Character Model for the World Wide Web: String
> Matching and Searching
> 
> >
> > On 6/19/2014 11:27 AM, Najib Tounsi wrote:
> > > On 6/19/14 2:51 PM, Matitiahu Allouche wrote:
> > >>
> > >> 11) In 2.2 table of Compatibility Equivalence, the third example is
> > >> labelled "Cursive forms". I think that this would be better
> > >> labelled "character shapes". Rationale: the example shows various
> > >> shapes of an Arabic letter. But similar examples could be taken
> > >> from final versus non-final shapes of some Hebrew letters, or from
> > >> the final versus non-final shapes of the Greek sigma letter. Hebrew
> > >> and Greek are not cursive scripts, so the issue here is having
> > >> position-dependent shapes, not cursiveness.
> >
> > The Greek final sigma uses a different character code which is not a
> > compatibility equivalent.
> >
> > The reason is that, unlike Arabic positional shaping, the selection of
> > the final form cannot be determined algorithmically at rendering time
> > and would otherwise introduce the need to use ZWNJ with Greek; not a good
> > tradeoff.
> >
> > Whatever example is used needs to be limited to cases of automatic
> > shape selection at rendering.
> >
> 
> Context matters here. The table is not merely one containing characters that
> use contextual shaping. These are *specifically* characters with compatibility
> decompositions in Unicode and the table is illustrating the various kinds of
> compatibility decomposition. I tend to agree with Mati's comment that "cursive
> forms" is not that accurate a label. In practice only Arabic uses <initial>,
> <medial>, <final>, and <isolated> decompositions, though, so the other offered
> examples are not what the table is meant to illustrate. The items in the table
> are the four compatibility variations of ARABIC LETTER NOON (U+0646).
> 
> Note that this table is identical to Figure 2 in UAX#15.
> 
> Addison

Received on Thursday, 19 June 2014 22:14:04 UTC