[css3-text] script categories, 'bicameral', 'discrete', Unicode links and more

fantasai, Wed, 13 Apr 2011 19:36:44 -0700:

>   http://www.w3.org/TR/css3-text/

Congrats.
 
> This module covers, among other things,
  [...]
>   * Text decoration
> 
> The latest revision in particular attempts to classify the scripts
> in Unicode according to their typographic behavior.
  [...]
>   http://www.w3.org/TR/css3-text/#script-categorization

Looking at

http://dev.w3.org/csswg/css3-text/#script-categorization

then:

* Nit: A comma is missing between 'Ethiopian' and 'Greek'
* Nit: Could the script categories be listed in the same order
  in #script-categorization as in #script-groups? (Or opposite.)
* Suggestion: HTML5 provides spec-internal links for any term
  that HTML5 itself defines. Could this spec follow suit? Thus 
  perhaps the terms 'discrete scripts', 'block script' etc
  could link back to the definition found in #script-groups ?
* Bicameral: Are there bicameral scripts that aren't discrete? If
  not, could you, instead of listing all the bicameral scripts,
  simply point to a definition of the term 'bicameral' and/or to
  a list of all the bicameral scripts elsewhere in the spec?
  [see more on bicameral/unicameral below]
* Clustered: some low-hanging fruit: #script-groups also mentions
  the Tibetan script. Please add it to #script-categorization
* Clustered: Wikipedia says that Tibetan script has
  influenced the scripts Limbu, Lepcha and 'Phags-pa - they are
  thus probably clustered as well. All 3 of them are also listed 
  at http://unicode.org/charts/script/ See Wikipedia:
  http://en.wikipedia.org/wiki/Tibetan_script#firstHeading
* Discrete: Unicode chapter '5.18 Case Mappings' says that
  Georgian *has been* bicameral. Which reminds me that the list of
  discrete scripts should include Georgian (which remains discrete
  even though it is now unicameral).
  http://www.unicode.org/versions/Unicode6.0.0/ch05.pdf#G21180
* Discrete: A look at the Unicode Case charts tells me that
  Glagolitic is also a bicameral script. 
  http://unicode.org/charts/case/
* If the listing should be 'exhaustive wrt Unicode' then simply
  dropping all the scripts that Unicode lists, into the 
  appropriate category list, would be a good start:
  http://unicode.org/charts/script/
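
As a rough cross-check of the bicameral candidates above, here is a
small Python sketch that scans the Basic Multilingual Plane and
collects the first word of the character name of every letter that has
a case mapping. Treating the first word of the name as "the script" is
only a heuristic of mine (Python's standard library does not expose the
Unicode Script property), the function name is made up for this
example, and the result depends on the Unicode version bundled with
the Python build.

```python
import unicodedata


def cased_script_prefixes(start=0x0000, end=0xFFFF):
    """Collect name prefixes (e.g. 'LATIN') of letters that have case mappings.

    Heuristic/assumption: the first word of the character name stands in
    for the script, so the result also contains non-script words.
    """
    prefixes = set()
    for cp in range(start, end + 1):
        ch = chr(cp)
        # Only consider letters (general categories Lu, Ll, Lt, Lm, Lo).
        if not unicodedata.category(ch).startswith("L"):
            continue
        # Treat a letter as cased if lowercasing or uppercasing changes it.
        if ch.lower() != ch or ch.upper() != ch:
            name = unicodedata.name(ch, "")
            if name:
                prefixes.add(name.split()[0])
    return prefixes


if __name__ == "__main__":
    print(sorted(cased_script_prefixes()))
```

The output is noisy (compatibility characters contribute prefixes that
are not script names), so it is a starting point for checking the
lists by hand, not a replacement for the Unicode charts.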

The word 'bicameral' occurs in the draft: [1]

]] The case mapping rules for the character repertoire specified by the 
Unicode Standard can be found on the Unicode Consortium Web site. 
[UNICODE] Only characters belonging to bicameral scripts are affected. 
[[

Comments:

* Instead of "Only characters belonging to bicameral scripts are 
affected" you could say "Scripts that operate with case mapping rules 
are known as 'bicameral' scripts" (and link to the spec's future 
definition of 'bicameral').
* Most users of bicameral scripts don't have the slightest clue what 
'bicameral' means. Please define it. Or at least point to a definition. 
The word means 'two-chamber' ... Thus it relates to (upper-/lower-)case.
* Perhaps 'bicameral' and 'unicameral' could get a new (sub)section 
in, for example, #script-groups?
* Unicode chapter '5.18 Case Mappings' has a bicameral/unicameral
  definition: "Alphabets with case differences are called 
  bicameral; those without are called unicameral." Et cetera.
  http://www.unicode.org/versions/Unicode6.0.0/ch05.pdf#G21180
* Unless you define it well enough in this spec, then rather than only 
pointing to [UNICODE] in a vague way, can you provide a link to 
Unicode's Case Charts? Or at least use the wording 'Unicode Case 
charts' so readers may easily Bing/Yahoo/Google it themselves? 
http://unicode.org/charts/case/
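
To make the definition concrete, here is a minimal Python sketch of
what "has case differences" means for a single character: it counts a
character as cased when lowercasing or uppercasing changes it. The
helper name is_cased and the sample characters are my own choices, and
the answers come from the Unicode data shipped with the Python build,
not from the css3-text draft.

```python
def is_cased(ch: str) -> bool:
    """Return True if the character has an upper- or lowercase counterpart."""
    return ch.lower() != ch or ch.upper() != ch


# Sample characters (chosen for illustration only).
samples = {
    "Latin capital A": "A",
    "Greek small omega": "\u03c9",
    "Glagolitic capital Azu": "\u2c00",
    "Hebrew alef": "\u05d0",
    "Tibetan ka": "\u0f40",
}

if __name__ == "__main__":
    for label, ch in samples.items():
        print(label, "->", "cased" if is_cased(ch) else "uncased")
```

A script would then be bicameral if its letters are cased in this
sense, and unicameral otherwise.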

UNICODE linking:

* In general, when referring to Unicode, could useful links be 
provided rather than text that speaks about Unicode version 4? See: 
http://dev.w3.org/csswg/css3-text/#UNICODE
* Please use the Unicode latest-version link, if the intent is that
  the latest version is the one that matters: 
  http://www.unicode.org/versions/latest/
* As a reader, here are some of the kinds of links to Unicode that are 
useful to me:
  - Link to the bookmarks section of the latest version
    http://www.unicode.org/versions/latest/bookmarks.html
  - Link to the sub sections - by using the links in the
    bookmarks page above
  - Links to online charts: http://unicode.org/charts/
  - some useful HTML sub pages, such as 
    http://unicode.org/charts/case/ and
    http://unicode.org/charts/script/
    are unfortunately not linked to from 
http://unicode.org/charts/case/ 

> [...] Input from the Indic, Southeast
> Asian, and Arabic script communities so far has been noticeably missing
> and would be especially appreciated.

An argument for having a section about 'bicameral' vs 'unicameral' 
scripts is that this helps in defining those scripts too.

[1] http://www.w3.org/TR/css3-text/#text-transform
-- 
leif halvard silli

Received on Thursday, 14 April 2011 13:44:32 UTC