Re: [css3-text] script categories, 'bicameral', 'discrete', Unicode links and more

On 4/15/2011 1:54 AM, fantasai wrote:
> On 04/14/2011 11:26 AM, John Cowan wrote:
>> Leif Halvard Silli scripsit:
>>> I considered stating that she could investigate those scripts. But
>>> anyway, let us look at Limbu examples, since that is apparently what you
>>> have done:
>>> How do you come to that conclusion? Are you looking at the word spaces?
>>> Are the spaces result of adaptation to the "computer age"? Anyway,
>>> please note that "_and_ have discrete, unconnected (in print) units
>>> within words" is part of the discrete definition.
>> Very well: see p. 18 of 
>> This was printed by offset.
> The main point that distinguishes 'discrete' from 'connected' is that
> letter-spacing is allowed to be used for justification. 

I argue that this is not a script property.

In English you may find narrow columns that are typeset with 
letterspacing to make them justified. If you do the same in German, many 
readers will mistake this for an attempt at  e m p h a s i s. (It used
to be more common, especially during the age of Fraktur, but it's still
widespread enough that some people use it manually, like I did here, in
internet postings.)

Whether letter-spacing is "allowed" for justification thus depends not
only on the script, but also on (local) conventions. In the example I gave,
letter-spacing is allowed for emphasis, but not for justification (the
latter, if you attempted it, would look like a ransom note to readers
who are used to interpreting letterspacing as emphasis).

What you are after is not "letterspacing for justification", but whether
letterspacing as such is possible.

Even with that, you run into limitations with Latin, because some type
styles used with Latin are clustered.

In typesetting German in Fraktur there are a number of required 
ligatures. These are not broken apart when letterspacing is applied (for 

I haven't yet had a chance to review your draft in detail, but if you 
haven't done so already, I suggest that you put a strong disclaimer 
somewhere that all this classification is suggestive of certain widely 
encountered practices, but that actual details for certain regions, 
languages, and even type styles may well differ.

Alternatively, you might consider not providing this breakdown in your
spec, and instead asking Unicode to add comparable documentation to their
description of scripts. That seems a much more workable place to
maintain and update this kind of information.


> Cases to look
> at include
>   - lines that have no word separators, and thus can't be justified
>     that way
>   - mixtures with scripts such as CJK, where letter-spacing is
>     sometimes applied equally to discrete scripts during justification
> Similarly 'clustered' vs 'discrete' can be distinguished by what happens
> when you mix the two scripts.
> Leif says I should just list all the scripts in Unicode and categorize
> them. Great idea in theory. But in practice, I do not know enough about
> their typesetting behavior to make a correct categorization and do not
> have access to enough printed materials in all the scripts in Unicode
> to make an educated guess.
> ~fantasai

Received on Friday, 15 April 2011 19:21:39 UTC