- From: Paul Grosso <pgrosso@arbortext.com>
- Date: Thu, 07 Aug 2003 10:43:22 -0500
- To: Karen Lease <klease@club-internet.fr>
- Cc: xsl-editors@w3.org
Karen,

Thanks for your continued interest in XSL and specifically getting to the bottom of whitespace handling issues. Needless to say, this topic has taken a lot of resources and time to investigate and (hopefully) resolve. Anyone who has ever worked with SGML, XML, and/or stylesheets knows how tricky "white space issues" can be.

Your comment at http://lists.w3.org/Archives/Public/xsl-editors/2002OctDec/0013 prompted a discussion within the FO Subgroup, highlighting an imperfect overlap between the Line-building section and the white-space-treatment and suppress-at-line-break properties.

We decided to move the processing of the white-space-treatment property into section 4.7.2 (Line-building), to modify the properties slightly for greater flexibility going forward, and to rescind the erratum that created a new property relating to suppress-at-line-break, which is now unnecessary. We did not find it necessary to move the processing of the white-space-collapse or linefeed-treatment properties. The newly reworded 4.7.2 and the rewritten property definitions 7.15.8 and 7.16.3 [below] now clarify the relations of these properties to Line-building.

Some whitespace handling happens in refinement and some happens in area generation, and this change moves some processing that was happening in refinement (processing of the white-space-treatment and suppress-at-line-break properties) into area generation (4.7.2 line-building).

We believe this is the best way to address the complexities of whitespace handling, and that it allows implementations and users to get all the control they need.

================================================

4.7.2 Line-building

This section describes the ordering constraints that apply to formatting an fo:block or similar block-level object.

A block-level formatting object F which constructs lines does so by constructing block-areas which it returns to its parent formatting object, and placing normal areas and/or anchor areas returned to F by its child formatting objects as children of those block-areas or of line-areas which it constructs as children of those block-areas.

For each such formatting object F, it must be possible to form an ordered partition P consisting of ordered subsets S1, S2, ..., Sn of the normal areas and anchor areas returned by the child formatting objects, such that the following are all satisfied:

1. Each subset consists of a sequence of inline-areas, or of a single block-area.

2. The ordering of the partition follows the ordering of the formatting object tree. Specifically, if A is in Si and B is in Sj with i < j, or if A and B are both in the same subset Si with A before B in the subset order, then either A is returned by a preceding sibling formatting object of B, or A and B are returned by the same formatting object with A being returned before B.

3. The partitioning occurs at legal line-breaks. Specifically, if A is the last area of Si and B is the first area of Si+1, then the rules of the language and script in effect must permit a line-break between A and B, within the context of all areas in Si and Si+1.

4. Forced line-breaks are respected. Specifically, if C is a descendant of F, and C is a fo:character whose Unicode character is U+000A, and A is the area generated by C, then either C is a child of F and A is the last area in a subset Si, or C is a descendant of a child C' of F, and A ends (in the sense of 4.2.5) an area A' returned by C', such that A' is the last area in a subset Si.

5. The partition follows the ordering of the area tree, except for certain glyph substitutions and deletions. Specifically, if B1, B2, ..., Bp are the normal child areas of the area or areas returned by F (ordered in the pre-order traversal order of the area tree), then there is a one-to-one correspondence between these child areas and the partition subsets (i.e. n = p), and for each i,

   * Si consists of a single block-area and Bi is that block-area, or

   * Si consists of inline-areas and Bi is a line-area whose child areas are the same as the inline-areas in Si, and in the same order, except that where the rules of the language and script in effect call for glyph-areas to be substituted, inserted, or deleted, then the substituted or inserted glyph-areas appear in the area tree in the corresponding place, and the deleted glyph-areas do not appear in the area tree. For example, insertions and substitutions may occur because of addition of hyphens or spelling changes due to hyphenation, or glyph image construction from syllabification, or ligature formation. Deletions occur as specified in (6.), below.

6. white-space-treatment is enforced. In particular, deletions in (5.) occur when there is a glyph area G such that

   (a.) the white-space-treatment of G is "ignore" and the character of G is classified as white space in XML; or

   (b.) the white-space-treatment of G is "ignore-if-before-linefeed" or "ignore-if-surrounding-linefeed", the suppress-at-line-break of G is "suppress", and G would end a line-area; or

   (c.) the white-space-treatment of G is "ignore-if-after-linefeed" or "ignore-if-surrounding-linefeed", the suppress-at-line-break of G is "suppress", and G would begin a line-area.

   In these cases the area G is deleted; this may cause the condition in clauses (b.) or (c.) to become true and lead to further deletions.

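As a rough illustration of clause 6, the following Python sketch applies the deletion rules to the glyphs of a single candidate line. It is not part of the proposed text. The names Glyph, resolved_suppress, build_line, glyphs_from_text and line_text are hypothetical, each glyph is assumed to carry its own already-inherited property values, and the U+000A exception for "ignore" is taken from the 7.15.8 wording rather than from clause (a.).

```python
from dataclasses import dataclass

# Characters classified as white space in XML: space, tab, CR, LF.
XML_WHITE_SPACE = {"\u0020", "\u0009", "\u000D", "\u000A"}
TRIM_AT_END = {"ignore-if-before-linefeed", "ignore-if-surrounding-linefeed"}
TRIM_AT_START = {"ignore-if-after-linefeed", "ignore-if-surrounding-linefeed"}


@dataclass
class Glyph:
    """Hypothetical stand-in for a glyph-area and the two relevant traits."""
    char: str
    white_space_treatment: str = "preserve"
    suppress_at_line_break: str = "auto"


def resolved_suppress(glyph):
    """Resolve 'auto': only U+0020 is treated as 'suppress'."""
    if glyph.suppress_at_line_break == "auto":
        return "suppress" if glyph.char == "\u0020" else "retain"
    return glyph.suppress_at_line_break


def _deletable_at_edge(glyph, treatments):
    return (glyph.white_space_treatment in treatments
            and resolved_suppress(glyph) == "suppress")


def build_line(glyphs):
    """Return the glyphs that survive clause 6 for one candidate line."""
    # Clause (a.): 'ignore' deletes XML white space anywhere in the line.
    # Assumption: U+000A is exempt, following the 7.15.8 definition.
    line = [g for g in glyphs
            if not (g.white_space_treatment == "ignore"
                    and g.char in XML_WHITE_SPACE
                    and g.char != "\u000A")]
    # Clause (b.): deleting the last glyph may expose another deletable
    # glyph, so repeat until the end of the line is stable.
    while line and _deletable_at_edge(line[-1], TRIM_AT_END):
        line.pop()
    # Clause (c.): same cascade at the start of the line.
    while line and _deletable_at_edge(line[0], TRIM_AT_START):
        line.pop(0)
    return line


def glyphs_from_text(text, treatment):
    """Hypothetical helper: one glyph per character, same treatment."""
    return [Glyph(c, treatment) for c in text]


def line_text(line):
    return "".join(g.char for g in line)
```

For instance, line_text(build_line(glyphs_from_text("  ab  ", "ignore-if-surrounding-linefeed"))) gives "ab", while the same input with "preserve" is returned unchanged. A leading tab would survive "ignore-if-surrounding-linefeed" in this sketch, because 'auto' resolves to 'retain' for every character other than U+0020.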
Substitutions that replace a sequence of glyph-areas with a single glyph-area should only occur when the margin, border, and padding in the inline-progression-direction (start- and end-), baseline-shift, and letter-spacing values are zero, treat-as-word-space is false, and the values of all other relevant traits match (i.e., alignment-adjust, alignment-baseline, color trait, background traits, dominant-baseline-identifier, font traits, text-depth, text-altitude, glyph-orientation-horizontal, glyph-orientation-vertical, line-height, line-height-shift-adjustment, text-decoration, text-shadow).

NOTE: Line-areas do not receive the background traits or text-decoration of their generating formatting object, or any other trait that requires generation of a mark during rendering.

----------------------------------

7.15.8 white-space-treatment

The values have the following meanings:

ignore
    Any glyph-area whose Unicode character is classified as white space in XML, except for U+000A, shall be deleted during line-building and inline-building (see 4.1.6 and 4.2.6).
preserve
    Any glyph-area whose Unicode character is classified as white space in XML shall not be deleted during line-building and inline-building.
ignore-if-before-linefeed
    Any glyph-area with a suppress-at-line-break value of 'suppress' shall be deleted during line-building and inline-building if it would be the last glyph-area descendant of a line-area.
ignore-if-after-linefeed
    Any glyph-area with a suppress-at-line-break value of 'suppress' shall be deleted during line-building and inline-building if it would be the first glyph-area descendant of a line-area.
ignore-if-surrounding-linefeed
    Any glyph-area with a suppress-at-line-break value of 'suppress' shall be deleted during line-building and inline-building if it would be the first or last glyph-area descendant of a line-area.

----------------------------------

7.16.3 suppress-at-line-break

The property has the following values:

auto
    The value is determined by the Unicode value of the object's character property. The character at code point U+0020 is treated as if 'suppress' had been specified. All other characters are treated as if 'retain' had been specified. etc.
suppress
    The glyph area generated by the fo:character is eligible to be suppressed at the start or end of a line-area depending on the white-space-treatment property (q.v.).
retain
    The glyph area generated by the fo:character shall be placed in the area tree whether or not it is first or last in a line-area.

================================================

Paul Grosso
for the XSL FO Subgroup of the XSL WG
Received on Thursday, 7 August 2003 11:47:35 UTC