Re: [css-text] I18N-ISSUE-313: Definition of grapheme clusters

On 26 June 2014 03:04, James Clark <jjc@jclark.com> wrote:

>
> On Fri, May 23, 2014 at 1:09 AM, Richard Ishida <ishida@w3.org> wrote:
>
>>
>> Another is a worry whether we can really effectively split the world into
>> semantically-perceived and visually-perceived characters - especially given
>> the 'etc' that appears in the definition where we list appropriate
>> operations for each. For example, are we sure that first-letter operations
>> require semantically- rather than visually-perceived characters in all
>> cases?  Where does cursor movement fit here? etc.
>> characters (eg. in the Thai case)?
>>
>
> The fundamental split, in my view, is between characters and glyphs. There
> are operations that are best understood as working on clusters of
> characters and there are operations that are best understood as working on
> clusters of glyphs.
>
> I would argue that cursor movement and line-breaking are character-level
> operations, whereas first-letter operations and letter-spacing are
> glyph-level operations.  For example, in Thai the boundary following a
> first-letter or the boundary where letter-space is to be inserted sometimes
> does not correspond to a boundary between characters.
>

And for some languages the boundary for first-letter may not correspond to
the first character or to the first grapheme cluster.
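
To make the distinction concrete, here is a rough JavaScript sketch, not a
definitive implementation. The helper names firstCodePoint and
firstApproxCluster are hypothetical, and the "cluster" rule used (one non-mark
character followed by any combining marks) is only an approximation of
grapheme cluster segmentation, not the full UAX #29 algorithm.

```javascript
// Hypothetical helpers, for illustration only.

// First Unicode code point of a string (a "character" in the encoding sense).
function firstCodePoint(text) {
  const cp = text.codePointAt(0);
  return cp === undefined ? "" : String.fromCodePoint(cp);
}

// Approximate first cluster: one non-mark character plus any combining marks
// that follow it. Assumption: this simplification is good enough for the
// examples below; it is NOT a complete UAX #29 implementation.
function firstApproxCluster(text) {
  const match = text.match(/^\P{M}\p{M}*/u);
  return match ? match[0] : "";
}

// Thai KO KAI + SARA II + MAI EK: one consonant carrying two combining marks.
const word = "\u0E01\u0E35\u0E48";
console.log(firstCodePoint(word).length);     // 1 (consonant only)
console.log(firstApproxCluster(word).length); // 3 (consonant plus both marks)

// Thai SARA E + KO KAI: the vowel is written before the consonant and is not
// a combining mark, so this approximation puts it in a cluster by itself.
const prefixed = "\u0E40\u0E01";
console.log(firstApproxCluster(prefixed).length); // 1 (the vowel sign alone)
```

In the second case the first cluster is just the prefixed vowel sign, which is
arguably not what a reader would treat as the "first letter"; that is the kind
of case where a syllable-based rule might be a better fit.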

Next week I hope to free up enough time to play with JavaScript and see if I
can put together a script to detect the first syllable of an element for a
couple of languages where it would be a useful alternative.

A.
-- 
Andrew Cunningham
Project Manager, Research and Development
(Social and Digital Inclusion)
Public Libraries and Community Engagement
State Library of Victoria
328 Swanston Street
Melbourne VIC 3000
Australia

Ph: +61-3-8664-7430
Mobile: 0459 806 589
Email: acunningham@slv.vic.gov.au
          lang.support@gmail.com

http://www.openroad.net.au/
http://www.mylanguage.gov.au/
http://www.slv.vic.gov.au/

Received on Wednesday, 25 June 2014 23:59:32 UTC