[css-text] Confused about the White Space Processing Details in CSS Text Module Level 3

From: Jon Ronnenberg <jon.ronnenberg@gmail.com>
Date: Wed, 4 Dec 2013 09:19:14 +0100
Message-ID: <CAPEZGVt3DW4XTCCFuXKg8YZfkYsPcjcymW+_95jGbNJCNwncvA@mail.gmail.com>
To: "www-style@w3.org" <www-style@w3.org>
Hi

I'm the author of white-space[1] and am trying to understand the changes in
the latest CSS Text Module Level 3 draft[2], especially part 4, about white
space processing. In the previous draft[3] and on the www-style list, it has
been noted that web authors, myself included, would like to be able to specify
that white space between HTML elements is discarded.

In the latest draft[2], the noted issue is gone, but as far as I can tell,
the spec now states that white space should be removed between HTML
elements.

Take the following example:
<ul><!-- segment break -->
  <li style="display:inline-block;">foo</li><!-- segment break -->
  <li style="display:inline-block;">bar</li><!-- segment break -->
</ul><!-- segment break -->
In the above, the <li> elements will be rendered with white space between
them. In the current spec[2], a line feed or carriage return is a segment
break, and a segment break is subject to the CSS white-space rules. Now,
according to 4.1.1 step 1, all spaces before and after the segment break are
removed. I.e., the following is equivalent:
<ul><!-- segment break -->
  <li style="display:inline-block;">foo</li>    <!-- segment break -->
<li style="display:inline-block;">bar</li><!-- segment break -->
</ul><!-- segment break -->

In step 2 the segment breaks are, according to 4.1.2, either converted to a
space or removed. Our rendered markup should now resemble the following:
<ul><!-- segment break -->
<li style="display:inline-block;">foo</li><!-- segment break -->
<li style="display:inline-block;">bar</li><!-- segment break -->
</ul><!-- segment break -->
"If the character immediately before or immediately after the segment break
is the zero-width space character (U+200B), then the break is removed,
leaving behind the zero-width space."[4].
I would think that we have no zero-width space character next to the break,
only nothing, as the spaces were removed in step 1 of 4.1.1. So if we skip the
East Asian part, we get to "Otherwise, the segment break is converted to a
space (U+0020).", which reinstates the white space.
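
To make my reading concrete, here is a rough Python sketch of the two steps as
I understand them. It is only an illustration, not the spec's algorithm: the
function names are my own, the markup is treated as a flat string, only "\n"
counts as a segment break, and the East Asian rule is skipped entirely.

```python
import re

ZWSP = "\u200b"  # zero-width space, U+200B


def strip_around_breaks(text: str) -> str:
    """My reading of 4.1.1 step 1: drop spaces/tabs touching a segment break."""
    return re.sub(r"[ \t]*\n[ \t]*", "\n", text)


def transform_breaks(text: str) -> str:
    """My reading of the quoted part of 4.1.2 (East Asian rule skipped)."""

    def replace(match: re.Match) -> str:
        source = match.string
        before = source[match.start() - 1] if match.start() > 0 else ""
        after = source[match.end()] if match.end() < len(source) else ""
        if before == ZWSP or after == ZWSP:
            return ""   # break removed, the zero-width space stays
        return " "      # otherwise the break becomes U+0020

    return re.sub(r"\n", replace, text)


def process(text: str) -> str:
    return transform_breaks(strip_around_breaks(text))


plain = "<li>foo</li>    \n  <li>bar</li>"
with_zwsp = "<li>foo</li>" + ZWSP + "\n<li>bar</li>"

print(repr(process(plain)))      # '<li>foo</li> <li>bar</li>'
print(repr(process(with_zwsp)))  # '<li>foo</li>\u200b<li>bar</li>'
```

Under these assumptions, the only way to avoid the reinstated space is to put
an explicit U+200B next to the break in the markup.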


*My questions are:*

   1. How do we, according to spec, retain a zero-width space character
   near the segment break, to get the desired result?
   2. Removing and reinstating white space seems buggy to me. Is there an
   error in the specs?

I understand the reason for having soft wrap opportunities, and my questions
are by no means a suggestion to get rid of them. I'm only concerned about
segment breaks between HTML elements being converted to white space. I think
the spec is aiming at content text.
E.g.:

<p>
Lorem ipsum dolor sit amet, consectetur<!-- segment break -->
adipisicing elit, sed do eiusmod tempor<!-- segment break -->
incididunt ut labore et dolore magna aliqua.<!-- segment break -->
</p>

In the above, white space should be reinstated between "consectetur" and
"adipisicing", etc.


Cheers, Jon

[1] https://github.com/dotnetCarpenter/white-space
[2] http://www.w3.org/TR/css3-text/
[3] http://www.w3.org/TR/2012/WD-css3-text-20121113/#white-space
[4] http://www.w3.org/TR/css3-text/#line-break-transform
Received on Wednesday, 4 December 2013 08:19:41 UTC
