csswg/css3-text Overview.html,1.104,1.105 Overview.src.html,1.202,1.203

Update of /sources/public/csswg/css3-text
In directory hutz:/tmp/cvs-serv23106

Modified Files:
	Overview.html Overview.src.html 
Log Message:
Rearrange text on justification expansion opportunities and script categories in response to Mark Davis's feedback. <http://lists.w3.org/Archives/Public/www-international/2011AprJun/0014.html>

Index: Overview.html
===================================================================
RCS file: /sources/public/csswg/css3-text/Overview.html,v
retrieving revision 1.104
retrieving revision 1.105
diff -u -d -r1.104 -r1.105
--- Overview.html	16 Apr 2011 06:15:59 -0000	1.104
+++ Overview.html	16 Apr 2011 07:11:35 -0000	1.105
@@ -89,7 +89,7 @@
     <dt>This version:
 
     <dd><a href="http://dev.w3.org/csswg/css3-text/Overview.html">$Date:
-     2011/04/13 08:45:47 $ (CVS $Revision$)</a> <!--
+     2011/04/16 06:15:59 $ (CVS $Revision$)</a> <!--
       <dd><a href="http://www.w3.org/TR/2011/WD-css3-text-20110416/">http://www.w3.org/TR/2011/WD-css3-text-20110416/</a></dd>
     -->
      
@@ -234,10 +234,6 @@
 
   <ul class=toc>
[...2006 lines suppressed...]
+        title="word-break:normal"><strong>5.2.</strong></a>
+
+       <li>word-spacing, <a href="#word-spacing0"
+        title=word-spacing><strong>9.1.</strong></a>
+
+       <li>word-wrap, <a href="#word-wrap0"
+        title=word-wrap><strong>7.2.</strong></a>
+
+       <li>word-wrap:break-word, <a href="#break-word"
+        title="word-wrap:break-word"><strong>7.2.</strong></a>
+
+       <li>word-wrap:hyphenate, <a href="#hyphenate"
+        title="word-wrap:hyphenate"><strong>7.2.</strong></a>
+
+       <li>word-wrap:normal, <a href="#normal3"
+        title="word-wrap:normal"><strong>7.2.</strong></a>
+      </ul>
+      <!--end-index--></div>
+    </div>
   </ul>

Index: Overview.src.html
===================================================================
RCS file: /sources/public/csswg/css3-text/Overview.src.html,v
retrieving revision 1.202
retrieving revision 1.203
diff -u -d -r1.202 -r1.203
--- Overview.src.html	16 Apr 2011 06:15:59 -0000	1.202
+++ Overview.src.html	16 Apr 2011 07:11:35 -0000	1.203
@@ -206,42 +206,6 @@
     the <em>legacy grapheme cluster</em> definition). The UA may further
     tailor the definition as allowed by Unicode.
 
-  <h3 id="script-groups">Script Categorization</h3>
-
-    <p>Typographic behavior varies somewhat by language, but varies drastically
-      by writing system. For convenience, CSS3 Text defines the following
-      script categories, which combine typographically-similar scripts together.
-
-    <dl>
-      <dt id="block-scripts"><dfn>block scripts</dfn></dt>
-        <dd>CJK (including Hangul and half-width kana) and by extension all
-          "wide" characters. (See [[!UAX11]])</dd>
-      <dt id="clustered-scripts"><dfn>clustered scripts</dfn></dt>
-        <dd>South-East Asian scripts that have discrete units but do not
-          use space between words (such as Thai, Lao, Khmer, Myanmar).
-          This category also includes the Tibetan script.</dd>
-      <dt id="discrete-scripts"><dfn>discrete scripts</dfn></dt>
-        <dd>Scripts that use spaces or visible word-separating
-          punctuation between words and have discrete,
-          unconnected (in print) units within words, such as Latin,
-          Greek, Ethiopic, Cyrillic, Hebrew.</dd>
-      <dt id="cursive-scripts"><dfn>cursive scripts</dfn></dt>
-        <dd>Arabic and similar cursive scripts.</dd>
-      <dt id="connected-scripts"><dfn>connected scripts</dfn></dt>
-        <dd>Devanagari, Ogham, and other scripts that use spaces between
-          words and baseline connectors within words.
-          By extension this category also includes Gurmukhi, Tamil and any
-          other Indic scripts whose typographic behavior is similar to
-          Devanagari.</dd>
-    </dl>
-
-    <p class="note">These definitions are used primarily in describing
-      <a href="#line-breaking">line-breaking</a> and
-      <a href="#text-justify">justification</a> behavior.
-
-    <p><a href="#script-categorization">Appendix E</a> provides a more
-      comprehensive listing of the various scripts in each category.
-
 <h2 id="conformance">
   Conformance</h2>
 
@@ -1035,9 +999,9 @@
         the text is predominantly using CJK characters with few non-CJK excerpts
         and it is desired that the text be better distributed on each line.</dd>
       <dt><dfn title="word-break:keep-all"><code>keep-all</code></dfn></dt>
-      <dd><a href="#block-scripts">Block</a> characters can no longer create
-        implied break points. Otherwise this option is equivalent to
-        ''normal''.
+      <dd>Lines may break only at <a href="#word-separator">word separators</a>
+        and other explicit break opportunities. Otherwise this option is
+        equivalent to ''normal''.
         This option is mostly used where the presence of word separator
         characters still creates line-breaking opportunities, as in Korean.</dd>
     </dl>
@@ -2110,57 +2074,11 @@
       group, but may vary within a line due to changes in the font or
       letter-spacing and word-spacing values. Since justification behavior
       varies by writing system, expansion opportunities are organized by
-      <a href="#script-groups">script categories</a>. The different types of
-      expansion opportunities are defined as follows:</p>
-
-    <dl>
-      <dt>spaces</dt>
-         <dd>An expansion opportunity exists at spaces and other
-           <a href="#word-separator">word separators</a>.
-           Expand as for <a href="#word-spacing">'word-spacing'</a>.</dd>
-      <dt>block</dt>
-      <dt>clustered</dt>
-      <dt>discrete</dt>
-         <dd>An expansion opportunity exists between two
-           <a href="#grapheme-cluster">grapheme clusters</a> when at least
-           one of them belongs to the affected script category and the spacing
-           that point has not already been altered at a higher priority.
-         </dd>
-      <dt>cursive</dt>
-         <dd>Words may be expanded through kashida elongation or other cursive
-           expansion processes. Kashida may be applied in discrete units or
-           continuously, and the prioritization of kashida points is UA-dependent:
-           for example, the UA may apply more at the end of the line. The
-           UA should not apply kashida to fonts for which it is inappropriate.
-           It may instead rely on other justification methods that lengthen
-           or shorten Arabic segments (e.g. by substituting in swash forms or
-           optional ligatures). Because elongation rules depend on the typeface
-           style, the UA should rely on on the font whenever possible rather
-           than inserting kashida based on a font-independent ruleset. The UA
-           should limit elongation so that, e.g. in multi-script lines a short
-           stretch of Arabic will not be forced to soak up too much of the
-           extra space by itself. If the UA does not support cursive elongation,
-           then no expansion points exist between grapheme clusters of these
-           scripts.</dd>
-      <dt id="punctuation-symbols">punctuation</dt>
-        <dd>An expansion opportunity exists between a pair of characters
-          from the Unicode symbols (S*) and punctuation (P*) classes and
-          at enabled <a href="#text-autospace">autospace</a> points.
-          <span class="issue">the relationship of expansion opportunity
-            and 'text-spacing' needs more review</span>
-          The default justification priority of these points depends on the
-          justification method as defined below; however there may be
-          additional rules controlling their justification behavior due to
-          typographic tradition.
-          For example, there are traditionally no expansion opportunities
-          between consecutive EM DASH U+2014, HORIZONTAL BAR U+2015, HORIZONTAL
-          ELLIPSIS U+2026, or TWO DOT LEADER U+2025 characters [[JLREQ]].
-          The UA may introduce additional levels of priority to handle expansion
-          opportunities involving punctuation.</dd>
-      <dt>connected</dt>
-        <dd>No expansion opportunities occur between pairs of connected script
-          grapheme clusters. <span class="issue">Is this correct?</span></dd>
-    </dl>
+      <a href="#script-groups">script categories</a>. In the table below,
+      an expansion opportunity exists between two
+      <a href="#grapheme-cluster">grapheme clusters</a> when at least
+      one of them belongs to the affected script category and the spacing
+      at that point has not already been altered at a higher priority.
 
     <table class="data">
       <caption>Prioritization of Expansion Points</caption>
@@ -2364,6 +2282,47 @@
 
     <p class="note">The ''auto'' column defined above is informative.</p>
 
+    <p>For the expansion opportunities in the <i>cursive</i> category,
+      words may be expanded through kashida elongation or other cursive
+      expansion processes. Kashida may be applied in discrete units or
+      continuously, and the prioritization of kashida points is UA-dependent:
+      for example, the UA may apply more at the end of the line. The
+      UA should not apply kashida to fonts for which it is inappropriate.
+      It may instead rely on other justification methods that lengthen
+      or shorten Arabic segments (e.g. by substituting in swash forms or
+      optional ligatures). Because elongation rules depend on the typeface
+      style, the UA should rely on the font whenever possible rather
+      than inserting kashida based on a font-independent ruleset. The UA
+      should limit elongation so that, e.g. in multi-script lines a short
+      stretch of Arabic will not be forced to soak up too much of the
+      extra space by itself. If the UA does not support cursive elongation,
+      then no expansion points exist between grapheme clusters of these
+      scripts.
+
+    <p class="note">
+      No expansion opportunities occur between pairs of connected script
+      grapheme clusters. <span class="issue">Is this correct?</span>
+
+    <p>The <dfn title="spaces-category">spaces</dfn> category defines an
+      expansion opportunity at spaces and other
+      <a href="#word-separator">word separators</a>. (See
+      <a href="#word-spacing">'word-spacing'</a>.)
+    <p>The <dfn id="punctuation-symbols">punctuation</dfn> category defines
+      the expansion opportunities that exist between any pair of characters
+      from the Unicode symbols (S*) and punctuation (P*) classes and
+      at enabled <a href="#text-autospace">autospace</a> points.
+          <span class="issue">the relationship of expansion opportunity
+            and 'text-spacing' needs more review</span>
+      The default justification priority of these points depends on the
+      justification method as defined below; however there may be
+      additional rules controlling their justification behavior due to
+      typographic tradition.
+      For example, there are traditionally no expansion opportunities
+      between consecutive EM DASH U+2014, HORIZONTAL BAR U+2015, HORIZONTAL
+      ELLIPSIS U+2026, or TWO DOT LEADER U+2025 characters [[JLREQ]].
+      The UA may introduce additional levels of prioritization to handle
+      expansion opportunities involving punctuation.</p>
+
     <p>The UA may enable or break optional ligatures or use other font
       features such as alternate glyphs to help justify the text under
       any method. This behavior is not defined by CSS.</p>
@@ -4339,31 +4298,43 @@
       please send the information to <a href="mailto:www-style@w3.org">www-style@w3.org</a>
       with <kbd>[css3-text]</kbd> in the subject line.</p>
 
-<h2 class="no-num" id="script-categorization">Appendix E: Categorization of Scripts</h2>
+<h2 class="no-num" id="script-groups">Appendix E: Scripts and Spacing</h2>
 
-<p><em>This appendix is informative (non-normative).</em></p>
+  <p><em>This appendix is informative (non-normative).</em></p>
 
-<p>This appendix categorizes some common scripts in Unicode 6.0 according
-to the <a href="#script-groups">categorization given above</a>.
+  <p>Typographic behavior varies somewhat by language, but varies drastically
+    by writing system. This appendix categorizes some common scripts in
+    Unicode 6.0 according to their justification and spacing behavior. Category
+    descriptions are descriptive, not prescriptive; the determining factor is
+    the prioritization of <i>expansion opportunities</i>.
 
-<dl>
-  <dt>block scripts</dt>
-    <dd>
-      Bopomofo,
-      Han,
-      Hangul,
-      Hiragana,
-      Katakana,
-      Yi
-  <dt>clustered scripts</dt>
-    <dd>
-      Khmer,
-      Lao,
-      Myanmar,
-      Thai
+  <dl>
+    <dt id="block-scripts"><dfn>block scripts</dfn></dt>
+      <dd>CJK and by extension all Wide characters. (See [[!UAX11]])
+        The following scripts are included:
+        Bopomofo,
+        Han,
+        Hangul,
+        Hiragana,
+        Katakana,
+        Yi
+      </dd>
+    <dt id="clustered-scripts"><dfn>clustered scripts</dfn></dt>
+      <dd>Scripts that have discrete units but do not use spaces between words,
+        such as many Southeast Asian systems.
+        The following scripts are included:
+        Khmer,
+        Lao,
+        Myanmar,
+        Thai
       <span class="issue">This list is likely incomplete. What else fits here?</span>
-  <dt>connected scripts</dt>
-    <dd>
+      </dd>
+    <dt id="connected-scripts"><dfn>connected scripts</dfn></dt>
+      <dd>Devanagari, Ogham, and other scripts that use spaces between
+        words and baseline connectors within words.
+        By extension this category also includes any other Indic scripts
+        whose typographic behavior is similar to Devanagari.
+      The following scripts are included:
       Bengali,
       Brahmi,
       Devanagari,
@@ -4373,17 +4344,23 @@
       Malayalam,
       Oriya?,
       Ogham,
-      Tamil,
+      Tamil?,
       Telugu
-  <dt>cursive scripts</dt>
-    <dd>
+      </dd>
+    <dt id="cursive-scripts"><dfn>cursive scripts</dfn></dt>
+      <dd>Arabic and similar inherently cursive scripts.
+      The following scripts are included:
       Arabic,
       Mongolian,
       N'Ko?,
       Phags Pa?,
       Syriac
-  <dt>discrete scripts</dt>
-    <dd>
+      </dd>
+    <dt id="discrete-scripts"><dfn>discrete scripts</dfn></dt>
+      <dd>Scripts that use spaces or visible word-separating
+        punctuation between words and have discrete,
+        unconnected (in print) units within words.
+      The following scripts are included:
       Armenian,
       Bamum?,
       Braille,
@@ -4402,14 +4379,34 @@
       Shavian,
       Tifinagh,
       Vai?
+      </dd>
 </dl>
 
 <p>UAs should treat unrecognized scripts as <i>discrete</i>.
 
 <p class="issue">This listing should ideally be exhaustive wrt Unicode.
 Please <a href="#status">send</a> suggestions and corrections to the CSS
-Working Group. (As described <a href="#script-groups">above</a>, the
-grouping is primarily based on justification behavior.)</p>
+Working Group.</p>
+
+<div class="note">
+  <p>Guidelines for classification consider letter-spacing and justification:
+  <ol>
+    <li>If the script is cursive and may expand cursively but must not
+      space between letters, it is <i>cursive</i>.
+    <li>If the script primarily flexes word separators, it is either
+      <i>discrete</i> or <i>connected</i>. <i>Discrete</i> scripts can
+      space between letters. <i>Connected</i> scripts must not space
+      between letters (typically because that would break the connections
+      or otherwise look bad).
+    <li>If the script primarily expands equally between its grapheme
+      clusters in native typesettings, it is either <i>block</i> or
+      <i>clustered</i>. The exact classification depends on whether it
+      always spaces when mixed with CJK and sometimes stays together
+      when mixed with Thai and related scripts (<i>block</i>) or
+      sometimes spaces when mixed with CJK and always spaces with Thai
+      (<i>clustered</i>).
+  </ol>
+</div>
 
 <h2 class="no-num">Appendix F: Full Property Index</h2>
 <!-- properties -->

Received on Saturday, 16 April 2011 07:11:40 UTC
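
The classification guidelines added to Appendix E read as a small decision procedure, so a rough sketch may help reviewers check the ordering of the rules. This is only an illustration under stated assumptions: the ScriptTraits fields, the classify() function, and the trait values used for the sample scripts are all hypothetical and are not defined anywhere in css3-text. The final fallback follows the appendix's advice that UAs should treat unrecognized scripts as discrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScriptTraits:
    """Hypothetical traits for one script; field names are not from the spec."""
    cursive_expansion: bool         # cursive and may expand cursively
    letter_spacing_ok: bool         # spacing between letters is acceptable
    flexes_word_separators: bool    # justification mainly stretches word separators
    expands_between_clusters: bool  # native typesetting spaces grapheme clusters equally
    always_spaces_with_cjk: bool    # only consulted for cluster-expanding scripts


def classify(t: ScriptTraits) -> str:
    """Apply the three Appendix E guidelines in the order they are listed."""
    # Guideline 1: cursive expansion, but no spacing between letters.
    if t.cursive_expansion and not t.letter_spacing_ok:
        return "cursive"
    # Guideline 2: word separators flex; letter spacing decides the category.
    if t.flexes_word_separators:
        return "discrete" if t.letter_spacing_ok else "connected"
    # Guideline 3: equal expansion between grapheme clusters.
    if t.expands_between_clusters:
        return "block" if t.always_spaces_with_cjk else "clustered"
    # Nothing matched: fall back to discrete, as for unrecognized scripts.
    return "discrete"


# Illustrative trait assignments (assumptions made for this sketch only).
SAMPLES = {
    "Arabic": ScriptTraits(True, False, False, False, False),
    "Latin": ScriptTraits(False, True, True, False, False),
    "Devanagari": ScriptTraits(False, False, True, False, False),
    "Han": ScriptTraits(False, True, False, True, True),
    "Thai": ScriptTraits(False, True, False, True, False),
}

if __name__ == "__main__":
    for name, traits in SAMPLES.items():
        print(name, classify(traits))
```

The rules are checked in document order, so a script that satisfies guideline 1 never reaches guideline 2. Whether the guidelines are meant to be mutually exclusive is not stated in the commit, so that ordering is a choice made for the sketch.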