- From: fantasai <fantasai@escape.com>
- Date: Mon, 21 Apr 2003 12:42:55 -0400
- To: www-style@w3.org
# In the most general case, (assuming no hyphenation dictionary is
# available to the UA), a line break can occur only at white space
# characters or hyphens, including U+00AD SOFT HYPHEN.
This doesn't seem to match UAX 14.
# line-break: normal | strict
#
# normal
# Selects the normal line breaking mode for CJK.
# strict
# Selects a more restrictive line breaking mode for CJK text.
...
# word-break-cjk: normal | break-all | keep-all
#
# normal
# Keeps non-CJK scripts together (according to their own rules),
# while Hangul and CJK ideographs... break according to the rules
# set by 'line-break' property.
# break-all
# Same as 'normal' for CJK ideographs and Hangul, but non-CJK
# scripts can break anywhere.
# keep-all
# Same as 'normal' for all non-CJK scripts. CJK ideographs and
# Hangul are kept together.
This organization of properties seems a bit non-optimal:
- 'line-break' is CJK-specific
- 'word-break-cjk' affects breaking in non-CJK
- 'line-break's functionality is tangled with 'word-break-cjk's
Suppose we did this:
line-break-cjk: normal | strict | word
line-break-general: normal | strict | anywhere
IMO this is much neater. We can control CJK-type scripts and other scripts
independently. Because of this, the purpose of the properties is also much
clearer.
For CJK-type scripts, line breaking is as follows:
line-break-cjk
normal - as for current "line-break: normal"
strict - as for current "line-break: strict"
word - as for current "word-break-cjk: keep-all"
For non-CJK, 'anywhere' selects the "break-anywhere" behavior of
"word-break-cjk: break-all" without affecting CJK scripts. 'normal' and
'strict' for non-CJK allow two different levels of line breaking.
For example,
line-break-general
normal - as defined in UAX 14 for non-ideographic
strict - only break on spaces and other explicit opportunities like ZWSP (U+200B)
anywhere - as for "word-break-cjk: break-all"
With this definition, 'strict' can be used to prohibit breaking after
hyphen-minus. It's also nice for formatting code with word wrap and
probably other things as well.
('strict' could also be left out, giving
line-break-general
normal - as defined in UAX 14 for non-ideographic
anywhere - as for "word-break-cjk: break-all")
# break-all
# Same as 'normal' for CJK ideographs and Hangul, but non-CJK scripts can
# break anywhere. This option is used mostly in a context where the text is
# predominantly using CJK characters with few non-CJK excerpts and it is
# desired that the text be better distributed on each line. The UAs may
# however limit the break everywhere behavior for script using clusters such
# as Thai.
The effect of "word-break-cjk: break-all" on the punctuation rules needs
to be explained. E.g. can there be a break between consecutive hyphens?
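To show why this matters, here are two possible readings of "break anywhere" as a small Python sketch. Both are hypothetical interpretations of mine, not taken from the draft, and the function names are invented for illustration.

```python
def breaks_between_all_chars(text):
    """Reading 1 (hypothetical): a break is allowed between any two characters."""
    return list(range(1, len(text)))


def breaks_keeping_hyphen_runs(text):
    """Reading 2 (hypothetical): as reading 1, but never split '--'."""
    return [
        i
        for i in range(1, len(text))
        if not (text[i - 1] == "-" and text[i] == "-")
    ]


print(breaks_between_all_chars("a--b"))
print(breaks_keeping_hyphen_runs("a--b"))
```

The two readings differ only at the position between the two hyphens, and the draft should say which one applies.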
~fantasai
Received on Monday, 21 April 2003 12:43:17 UTC