[css-text] I18N-ISSUE-309: Haphazard use of term 'character'

From: Phillips, Addison <addison@lab126.com>
Date: Fri, 24 Jan 2014 18:16:14 +0000
To: "CSS WWW Style (www-style@w3.org)" <www-style@w3.org>
CC: www International <www-international@w3.org>
Message-ID: <7C0AF84C6D560544A17DDDEB68A9DFB517C8E717@ex10-mbx-36009.ant.amazon.com>
    OPEN WG Comment
Raised by:
    Addison Phillips
Opened on:
    Several members of the I18N WG are concerned about the following paragraph and subsequent use of the term "character" in the current draft:

    Within this specification, the ambiguous term character is used as a friendlier synonym for grapheme cluster. See Characters and Properties for how to determine the Unicode properties of a character.

    While "character" is a friendlier synonym, and other CSS documents make the same "redefinition" for the convenience of spec users, this particular spec really does need to distinguish more clearly between character-as-a-term-of-art and character-as-a-grapheme-synonym.

    For example, slightly after the definition above we see the first "ambiguous" use of "character":

    The rendering characteristics of a character divided by an element boundary is undefined: it may be rendered as belonging to either side of the boundary, or as some approximation of belonging to both. Authors are forewarned that dividing grapheme clusters by element boundaries may give inconsistent or undesired results.

    We would be happier if the words "character" and "grapheme cluster" (particularly in this instance) were used more strictly. As written, this paragraph is not completely clear. An illustration of "a character" being so divided might help. Other cases exist where it isn't clear whether grapheme clusters or code points are intended.
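    As one possible illustration of the distinction, the Python sketch below contrasts code points with grapheme clusters for a short string, and checks whether a split point falls inside a cluster. It is not taken from the draft: the helper `approx_clusters` is hypothetical and only groups a base code point with the combining marks that follow it, which is a rough approximation of the grapheme cluster rules in UAX #29. The sample string and the split offset are likewise assumptions chosen for the example.

```python
import unicodedata
from typing import List


def approx_clusters(text: str) -> List[str]:
    """Group each base code point with the combining marks that follow it.

    Simplification (assumption): real grapheme cluster segmentation follows
    UAX #29 and covers more cases, such as Hangul jamo and ZWJ sequences.
    """
    clusters: List[str] = []
    for ch in text:
        # General categories Mn, Mc and Me are combining marks.
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# "e" + COMBINING ACUTE ACCENT + "a": three code points, two clusters.
text = "e\u0301a"
code_points = [f"U+{ord(c):04X}" for c in text]
clusters = approx_clusters(text)

# Collect the code point offsets at which a cluster starts or ends.
cluster_boundaries = {0}
offset = 0
for cluster in clusters:
    offset += len(cluster)
    cluster_boundaries.add(offset)

# An element boundary after the first code point, as in markup like
# e<span>&#x301;</span>a, falls between "e" and its accent.
split_at = 1
splits_cluster = split_at not in cluster_boundaries

print(code_points)
print(len(clusters), splits_cluster)
```

    Under these assumptions, the element boundary separates a base letter from its combining accent, which appears to be the situation the quoted paragraph calls undefined. A reader who takes "character" to mean "code point" would see no division at all, since each code point stays intact.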

NOTE: This is a substantial comment that the I18N WG feels strongly about: the use of terminology in this document is fundamental to getting the implementation correct, and the use of 'character' with multiple conflicting meanings is problematic.

Received on Friday, 24 January 2014 18:17:00 UTC
