CSS3-Text: Text wrapping and White-space control, II from fantasai on 2003-05-04 (www-style@w3.org from May 2003)

From: fantasai <fantasai@escape.com>
Date: Sun, 04 May 2003 10:25:11 -0400
To: www-style@w3.org
Message-ID: <3EB522C7.7050406@escape.com>
These comments are more from an editorial slant.

CSS3 Text, Section 7
http://www.w3.org/TR/2003/WD-css3-text-20030226/#text-wrapping

Overall
-------

   Move the section on text wrap after the section on white-space control, since
   wrapping logically happens after white space processing.

   Move the section on 'white-space-treatment' before the section on
   'linefeed-treatment' because white-space discard happens before linefeed
   transform.

     # The following section frequently uses the term line feed character to
     # specify the normalized newline indicator. In XML and HTML context, the
     # line feed character is the LINE FEED (U+000A). In other contexts, it
     # may be represented differently, for example by a CARRIAGE RETURN
     # (U+000A). The term 'line feed character' represents the normalized
     # newline character native to a given framework.

   Why don't you just use the term 'newline character' throughout? Your intent
   would be clearer that way.


Overview (7.)
-------------

     # Text wrapping and white space handling are interrelated through the CSS2
     # 'white-space' property combining these two effects together.

   change to

     Text wrapping and white-space handling are interrelated through the
     'white-space' property, which controls these two effects together.

     # Text wrapping and text overflow both deal with situation where the text
     # reaches the flow after-edge of its containing box.

   I don't think this sentence is necessary. Take it out.

     # CSS3 clearly separates these three effects in different sets of property
     # while keeping the 'white-space' property for compatibility reasons.

   I don't think this is particularly useful either, but you should change
   "for compatibility reasons" to "as a shorthand". It's useful to keep beyond
   compatibility reasons: I certainly don't want to set all these silly
   'ignore-if-before-or-after-line-feed' values every time I want to specify
   "white-space: normal".

   Also, "in different sets of property" -> "into different properties"

Defining White Space (7.2)
--------------------------

     # ...
     #
     # The amount of white space processing that can be achieved by a user
     # agent that supports CSS is directly related to the CSS processing
     # model, especially the document parsing and validation. After parsing
     # and possible validation, the document tree may contain text nodes
     # that contain unprocessed white space characters, or the document tree
     # may already have been processed in a way that white space characters
     # have been collapsed and partially removed (white space normalization).
     #
     # In that respect, the CSS properties related to white space processing
     # can only be effective if the CSS processor has access to the white
     # space characters that were originally encoded in the document. However,
     # end-of-line characters are typically handled (like by XML processors)
     # in such a way that any arbitrary combination of end-of-line characters
     # is replaced by a single line feed character.
     #
     # Note: The first version of XML [XML1.0] only normalizes two characters
     # sequences of (U+000D U+000A) or any U+000D not followed by U+000A to a
     # single U+000A. The forthcoming version of XML [XML1.1] adds U+0085 (NEL)
     # and U+2028 (LINE SEPARATOR) to the line feed normalization process.
     # However the set of white space characters is unchanged. Notably, the
     # character U+2029 (PARAGRAPH SEPARATOR) is not part of that set. If the
     # characters U+2028 and U+2029 appears in text, they are treated as zero-
     # width characters without semantic meaning.
     #
     # Note: XML Schema, through its 'whiteSpace' facet can constrain exactly
     # the type of white space characters still available to a rendering
     # process like CSS for elements containing string datatype. In addition,
     # some XML languages like [XHTML1.0] may have their own white space
     # processing rules when parsing and validating documents with white space
     # characters. Therefore, some of the behaviors described below may be
     # affected by these limitations and may be user agent dependent in these
     # contexts...

This text is repetitive and includes a lot of extraneous info. I'm going to
conjecture and say that the whole section should be rewritten as

   White space processing in CSS is the method by which white space
   characters are interpreted for rendering; it has no effect on the
   underlying document data. In the context of CSS, the white space
   set is defined to be any space characters (Unicode value U+0020),
   tab characters (U+0009), or newline characters (typically line
   feed, U+000A). The characters U+000D (CARRIAGE RETURN), U+0085
   (NEL), U+2028 (LINE SEPARATOR), and U+000A (LINE FEED), if they
   are not defined as the newline character, are [ignored | treated
   as zero-width space (U+200B) | something else?].

   The document parser should normalize newline sequences according to
   its format rules before CSS processing takes effect. However, in
   generated content strings the line feed character (U+000A) and only
   the line feed character is considered a newline sequence. This way,
   style rules behave consistently across systems.

   Note: The document parser may have not only normalized newline
   characters, but also collapsed other space characters or otherwise
   processed white space according to markup rules. Because CSS
   processing occurs *after* the parsing stage, it is not possible to
   restore these characters for styling. Therefore, some of the
   behavior specified below can be affected by these limitations and
   may be user agent dependent.
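
   To make the proposal concrete, here is a rough Python sketch of the
   parser-side normalization described in the quoted XML 1.0 note (CR LF
   pairs and lone CRs become a single LF), followed by the three-character
   white space set suggested above. The names normalize_newlines_xml10,
   CSS_WHITE_SPACE and is_css_white_space are made up for illustration,
   and the sketch assumes the newline character is LINE FEED.

```python
import re

# Assumption: the framework's newline character is LINE FEED (U+000A).
CSS_WHITE_SPACE = {"\u0020", "\u0009", "\u000A"}  # space, tab, newline


def normalize_newlines_xml10(text: str) -> str:
    """Replace each CR LF pair and each lone CR with a single LF."""
    return re.sub(r"\r\n?", "\n", text)


def is_css_white_space(ch: str) -> bool:
    """Return True if ch belongs to the proposed CSS white space set."""
    return ch in CSS_WHITE_SPACE
```

   For example, normalize_newlines_xml10("a\r\nb\rc") gives "a\nb\nc", and
   U+2028 is not in the set, so how to treat it remains the open question
   in the brackets above.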


White-space Processing Rules
----------------------------

     #       2. any space (U+0020) following another space (U+0020)--even a
     #          space before the inline, if that space also has
     #          'all-space-treatment' set to collapse--is removed.

   change "even a space" to "which could be a space"
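
   A small sketch of what this rule means in practice, assuming every run
   of text involved has 'all-space-treatment' set to collapse. The text of
   a block is modelled as a list of strings, one per inline; the function
   name collapse_across_runs is hypothetical.

```python
def collapse_across_runs(runs):
    """Drop every U+0020 that directly follows another U+0020.

    The "previous character" is tracked across run boundaries, so a space
    at the start of one inline is removed when the preceding inline ended
    with a space.
    """
    collapsed = []
    prev_was_space = False
    for run in runs:
        kept = []
        for ch in run:
            if ch == " " and prev_was_space:
                continue  # this space follows another space: remove it
            kept.append(ch)
            prev_was_space = ch == " "
        collapsed.append("".join(kept))
    return collapsed
```

   With the input ["a  b ", " c"], the second space after "a" and the
   leading space of the second run are both removed.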

     # 2. All tabs (U+0009) are rendered as a horizontal shift that lines
     #    up the start edge of the next glyph with the next tab stop. Tab
     #    stops occur at points that are multiples of 8 times the width
     #    of a space (U+0020) rendered in the block's font from the block's
     #    starting content edge.
     # 3. If a space (U+0020) at the end of a line has 'all-space-treatment'
     #    set to 'collapse', it is also removed.
     #
     # Note: Tab stops line up in the block regardless of font change.

   Move the note into 2, like this:
     2. All tabs... from the block's starting content edge. (Thus tab stops
        line up in the block regardless of font change.)

   You can take out the parentheses if you prefer.
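
   The reason tab stops line up regardless of font change can be seen in a
   short sketch. Positions are measured from the block's starting content
   edge, and the space width always comes from the block's font, never
   from the inline's. The function name next_tab_stop and the choice that
   a tab sitting exactly on a stop advances to the following stop are
   assumptions, not something the draft states.

```python
import math


def next_tab_stop(x, block_space_width, spaces_per_stop=8):
    """Return the position of the first tab stop strictly after x.

    x is the current position measured from the block's starting content
    edge; block_space_width is the width of U+0020 in the block's font.
    """
    interval = spaces_per_stop * block_space_width
    return (math.floor(x / interval) + 1) * interval
```

   With a 5-unit space in the block's font, stops fall at 40, 80, 120 and
   so on, whatever font the text before the tab happens to use.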


     # These rules do not apply to elements that have an explicit white-space
     # rendering behavior (like the pre element in XHTML).

   Change to

     Implementations may ignore these rules on elements that have a markup-
     defined white-space rendering behavior (like the pre element in XHTML).


     # The 'white-space' property is a shorthand property for
     # 'linefeed-treatment', 'white-space-treatment', 'all-space-treatment'
     # and 'wrap-option'.

   This sentence kinda appears out of nowhere. It belongs under the description
   for 'white-space', where it should either replace or supplement "This property
   declares how 'white-space' inside the element is handled".

wrap-option
-----------

   # This property controls whether or not text wraps when it reaches
   # the flow edge of its containing block box. Several value
   # descriptions use the term preserved line feed characters. A
   # preserved line feed character (either from the source content or
   # from occurrence of "\A" in generated content) is maintained for
   # presentation purpose and may therefore influence text wrapping.
   # The preserved status of line feed characters is determined by the
   # 'linefeed-treatment' property. The 'wrap-option' possible values
   # are:

   replace with

     This property controls whether and how text wraps when it reaches
     the flow edge of its containing block box. In all cases a preserved
     newline character (one that hasn't been transformed by
     'linefeed-treatment') forces a line break.

   # wrap
   #   The text is wrapped at the best line-breaking
   #   opportunity (if required) within the available
   #   block inline-progression dimension (block width
   #   in horizontal text flow). The best line-breaking
   #   opportunity is determined in priority by the
   #   existence of preserved line feed characters, or
   #   by the line-breaking algorithm controlled by the
   #   'line-break' and word-break' properties.

   replace with

     Lines wrap at the best line-breaking opportunity as
     determined by the XXX properties

   where XXX stands for the line-breaking properties

   # no-wrap
   #   The text is only wrapped where explicitly specified
   #   by preserved line feed characters. ...

   replace with

     Lines only break where explicitly specified by preserved
     newline characters. ...
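
   A simplified sketch of the two values as reworded above. It assumes the
   width is counted in characters and that the only line-breaking
   opportunities are spaces, which stands in for whatever the line breaking
   properties would really decide. The function name break_lines is
   hypothetical.

```python
def break_lines(text, wrap_option, width):
    """Split text into lines for wrap-option 'wrap' or 'no-wrap'."""
    if wrap_option not in ("wrap", "no-wrap"):
        raise ValueError("unsupported wrap-option: " + wrap_option)
    lines = []
    # In all cases a preserved newline character forces a line break.
    for segment in text.split("\n"):
        if wrap_option == "no-wrap":
            lines.append(segment)  # no other breaks are allowed
            continue
        current = ""
        for word in segment.split(" "):
            candidate = word if not current else current + " " + word
            if current and len(candidate) > width:
                lines.append(current)  # break at the last space that fit
                current = word
            else:
                current = candidate
        lines.append(current)
    return lines
```

   For "aa bb cc\ndd" and a width of 5, 'wrap' yields three lines and
   'no-wrap' yields two; the break before "dd" happens either way.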

linefeed-treatment
------------------

   # auto
   #   The user agent either transforms each line feed character to a space
   #   character (U+0020), transforms each line feed character to a zero width
   #   space character (U+200B), or removes the line feed characters, following
   #   the line feed conversion algorthim. The choice of the resulting
   #   character is conditioned by the script value of the characters preceding
   #   and following the line feed character which are part of the same inline
   #   text flow in the same block element. The script value of each character
   #   is determined by the 'text-script' property.

   rewrite as

      The user agent transforms each newline character according to the newline
      conversion algorithm.

   or, if you prefer,

      The user agent transforms each newline character into a word separator as
      described in the newline conversion algorithm.

   Also, remove the "for rendering purpose" from the descriptions for
   'treat-as-space' and 'treat-as-zero-width-space' and either change
   "can be treated" to "might be treated" or take the sentence out.
   (The order of processing is already described in the White Space
   Processing rules.)
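
   For the two fixed values the transformation is a plain substitution, as
   this sketch shows. It deliberately leaves out 'auto', since the newline
   conversion algorithm is not reproduced in this message. The names
   REPLACEMENTS and apply_linefeed_treatment are made up, and the newline
   character is assumed to be LINE FEED.

```python
# Replacement character for each fixed value mentioned above.
REPLACEMENTS = {
    "treat-as-space": "\u0020",
    "treat-as-zero-width-space": "\u200B",
}


def apply_linefeed_treatment(text, value):
    """Replace every newline according to a fixed linefeed-treatment value."""
    if value not in REPLACEMENTS:
        raise ValueError("value not covered by this sketch: " + value)
    return text.replace("\n", REPLACEMENTS[value])
```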

   # immediately follows a linefeed character, shall be discarded

   Take out the comma.

   # Note: The Unicode Standard [UNICODE] specifies that the zero width space
   # is considered a valid line-break point and that if two characters with a
   # zero width space in between are placed on the same line they are placed
   # with no space between them; and that if they are placed on two lines no
   # additional glyph area, such as for a hyphen, is created at the line-break.

   This breaks the flow from the 'linefeed-treatment' values to the conversion
   algorithm. Move the note to after the algorithm. You can also shorten it to

     Note: According to the Unicode Standard [UNICODE], the zero width space
     is a valid line breaking opportunity. It has no advance width.

   # This action shall take place regardless of the setting of the
   # linefeed-treatment property.

   This is written into the white space processing rules, but if you want to
   mention it here, write it into the main description instead of copying it
   under every value definition.

   Your subject and verb don't agree in
    # Specifies that any white space characters, except for linefeeds, that
    # precedes/follows

   The subject is "characters", so the verb should be "precede" or "follow".
   (no 's')

white-space-treatment
---------------------

   # This property specifies the treatment for rendering purpose of
   # the space character (U+0020) and other white space characters
   # (except for line feed characters, since their treatment is
   # determined by the 'linefeed-treatment' property).

   change to

     This property specifies which white space characters, other than
     newline characters, are to be rendered. (Newlines are handled by
     the 'linefeed-treatment' property.)

all-space-treatment
-------------------

   # The 'all-space-treatment' property specifies the treatment of all
   # consecutive white-space characters (with no exception for linefeed
   # characters, unlike the 'white-space-treatment' property). Values
   # have the following meanings:

   rewrite as

     The 'all-space-treatment' property specifies whether consecutive
     white space characters collapse. It takes effect after
     'white-space-treatment' and 'linefeed-treatment' as described in
     the White Space Processing rules.
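
   The ordering is the important part, so here is a sketch of the three
   steps in sequence for roughly what "white-space: normal" would expand
   to. It assumes white space next to a newline is discarded, newlines are
   treated as spaces, and consecutive spaces collapse; the function name
   process_white_space is hypothetical.

```python
import re


def process_white_space(text):
    """Apply the three white space steps in their processing order."""
    # Step 1 ('white-space-treatment'): discard spaces and tabs that
    # come directly before or after a newline.
    text = re.sub(r"[ \t]*\n[ \t]*", "\n", text)
    # Step 2 ('linefeed-treatment'): turn each remaining newline into
    # a space.
    text = text.replace("\n", " ")
    # Step 3 ('all-space-treatment'): collapse consecutive spaces.
    return re.sub(r" {2,}", " ", text)
```

   Running the steps in another order changes the result for some inputs,
   which is why the description should point at the processing rules
   rather than try to restate them.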

   The name might be inane, but that doesn't mean the description should
   be awkwardly worded to justify it.

BTW, you have a tendency to put in too much information when referring
to another document in the Notes. You need to explain why the document is
relevant--i.e. what sort of information do we need from there--but you
don't need to write a synopsis.

~fantasai
Received on Sunday, 4 May 2003 10:24:19 UTC