- From: John Daggett <jdaggett@mozilla.com>
- Date: Mon, 3 Dec 2012 22:40:15 -0800 (PST)
- To: www-style list <www-style@w3.org>
The CSS3 Text spec defines a property 'text-justify', used to determine the style of justification when 'text-align: justify' is used. There is no normatively proposed justification algorithm, only a rough categorization of scripts into groups, plus property values that assign different "priorities" to expansion opportunities based on script.

Line breaking and justification behavior in user agents today is dictated by the script and language. Western text uses a spacing model where line breaks and expansion occur at word spaces. For Japanese, line breaks occur everywhere within CJK script runs except where forbidden by explicit rules. For Thai, line breaking occurs at syllables, so a dictionary-based approach is needed to determine word/syllable boundaries.

The use of ad-hoc script categories is highly problematic in my opinion. The categories are defined non-normatively in Appendix E, and they are not exhaustive enough to tell an implementor what to do in all cases. As John Hudson stated last year [1]:

  As it stands, the proposed classification criteria seems confused and to be based on an idiosyncratic analysis that ends up forcing closely related writing systems into different categories; there may be good reason for these divisions based on line-breaking needs, but for anyone familiar with more typical script analysis the use of familiar terms in strange ways is confusing, as are the implied groupings. For instance, under the categorisation criteria, Devanagari and Bengali would be considered 'connected scripts', while Gujarati and Oriya would be 'discrete scripts', despite the fact that all four scripts are closely related, have historically been analysed as local variants of the same writing system, and share important features that are ignored by the proposed classification criteria.
  The term cursive is problematic because virtually any writing system can and has been written in a cursive form, even nominal 'block scripts'. There are plentiful examples of cursive Latin script, and in many instances these are analysable as being at the same time cursive and discrete, since the letters within words retain their discrete isolated shapes and are linked by joining strokes that are not part of the letter. This in contrast to Arabic, in which the joining strokes are part of the letters, replacing other strokes that occur in the isolated forms. So the distinction between Latin and Arabic is that the latter is morphographical, while both may be written in cursive styles.

  [This also raises the issue of the degree to which nominal script-level decisions about line-breaking and justification can be safely applied to particular styles and particular fonts. If a justification model permits inter-character spacing adjustment of 'discrete' scripts, what is the effect on cursive font styles?]

At the San Diego F2F I noted that the use cases for 'text-justify' values are very unclear in the spec [2]. An example was added showing line breaking with different values, but that raises more questions than it answers. When is an author going to use one type of justification versus another? What differences in justification will they expect to see? If this is primarily based on the language conventions for a particular script, why does this need to be specified via a property value? User agents already implement language-specific behavior; why does a property value need to be set in addition?

I think property values for 'text-justify' need to address justification behavior explicitly rather than inferring it via script category prioritization.
Here again, John Hudson put it best [1]:

  Again I come back to my previous point: if what the spec is trying to address is line-breaking and justification behaviour, coming at it from nominal script categorisation seems like a basic confusion of categories. We can get hung up on all sorts of concepts within grammatology, when really we don't need to if we instead start by defining line-breaking and justification behaviour types, and then look at how these map to individual scripts (with appropriate caveats or exceptions re. language, locale, style). That makes much more sense to me than starting by trying to categorise scripts according to unclear and non-discrete criteria and then trying to map these to line-breaking and justification behaviours. Start with the function.

Since this property as currently defined seems experimental, I would suggest defining only the behavior for which there's a clear use case and leaving the others to be defined later:

  text-justify: auto | distribute

where 'auto' is the user agent default behavior and 'distribute' means expand inter-letter and inter-word spacing equally.

Regards,

John Daggett

[1] John Hudson on the problems of using script categorization for justification
    http://lists.w3.org/Archives/Public/www-style/2011Apr/0525.html
    http://lists.w3.org/Archives/Public/www-style/2011Apr/0518.html
    http://lists.w3.org/Archives/Public/www-style/2011Apr/0524.html
    http://lists.w3.org/Archives/Public/www-style/2011Apr/0526.html

[2] http://lists.w3.org/Archives/Public/www-style/2012Aug/0897.html
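To make the proposed 'distribute' value concrete, here is a rough sketch contrasting it with word-space-only expansion, the spacing model described above for Western text. It is an illustration only, not a proposed algorithm: the function names are hypothetical, every character is assumed to have the same width, and a "letter" is taken to be a single code point, which ignores grapheme clusters, ligatures and cursive joining.

```python
def word_space_extra(line: str, target_width: float,
                     char_width: float = 1.0) -> list[float]:
    """Extra advance added after each character when only word spaces expand.

    Assumption: this models the Western spacing behaviour; it is not a
    definition of what 'auto' must do in a user agent.
    """
    return _share_slack(line, target_width, char_width,
                        lambda i: line[i] == " ")


def distribute_extra(line: str, target_width: float,
                     char_width: float = 1.0) -> list[float]:
    """Extra advance added after each character when every gap expands equally."""
    return _share_slack(line, target_width, char_width, lambda i: True)


def _share_slack(line, target_width, char_width, is_expandable):
    # Slack is the space left over once the line is set at natural width.
    slack = target_width - len(line) * char_width
    extra = [0.0] * len(line)
    # The last character never expands, so the line ends flush at the margin.
    gaps = [i for i in range(len(line) - 1) if is_expandable(i)]
    if slack <= 0 or not gaps:
        return extra
    share = slack / len(gaps)
    for i in gaps:
        extra[i] = share
    return extra
```

With the line "ab cd" set to a width of 9 units, word_space_extra puts all 4 units of slack on the single space, while distribute_extra adds 1 unit after each of the first four characters.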
Received on Tuesday, 4 December 2012 06:40:51 UTC