Re: [CSS21][CSS3 Text] Re: Treating carriage return as white space in layout

On 07/08/2010 12:53 PM, Henri Sivonen wrote:
> On Jul 7, 2010, at 14:27, fantasai wrote:
>
>> CSS3 Text seems pretty clear to me:
>>   # In the context of CSS, the document white space set is defined to be any
>>   # space characters (Unicode value U+0020), tab characters (U+0009), or line
>>   # break characters (defined by the document format: typically line feed,
>>   # U+000A). Control characters besides the white space characters and the
>>   # bidi formatting characters (U+202x) are treated as normal characters and
>>   # rendered according to the same rules.
>
> This part alone would be clear...
>
>>   # The document parser must normalize line break character sequences according
>>   # to its own format rules before CSS processing takes effect.
>
> ...but this part steps outside the jurisdiction of CSS (or is at least a weird
> use of 'must' to reinforce whatever 'its own format rules' say)...

I think it's reasonable for CSS to say whether CSS white space collapsing
takes effect before or after non-CSS processing.

>> However, in
>>   # generated content strings the line feed character (U+000A) and only the line
>>   # feed character is considered a line break sequence. For CSS white space
>>   # processing all line breaks must be normalized to a single character
>>   # representation—usually the line feed character (U+000A)—here called a
>>   # "line break".
>
> ...and this part introduces doubt about whether the first paragraph was actually clear.

This part just says what the equivalent of the document-language normalization
rules is for generated content. I don't see how it is inconsistent with
the first paragraph.
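
To make the distinction concrete, here is a minimal sketch in Python. The
function names are made up for illustration, and the markup-level rule shown
(fold CRLF and lone CR into LF) is an assumption about what a typical document
format does, not something CSS defines:

```python
LF = "\n"
CR = "\r"

def normalize_document_text(text: str) -> str:
    """Hypothetical markup-level step: fold CRLF and lone CR into LF.

    This stands in for "its own format rules"; a given document
    format may normalize differently.
    """
    return text.replace(CR + LF, LF).replace(CR, LF)

def count_generated_content_breaks(text: str) -> int:
    """In generated content strings, only U+000A counts as a line break,
    so a CR in the string is not counted here."""
    return text.count(LF)

print(repr(normalize_document_text("a\r\nb\rc\nd")))
print(count_generated_content_breaks("a\r\nb\rc"))
```

Under these assumptions a CR never reaches CSS from parsed markup text, while
a CR in a generated content string is left alone and is not a line break.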

>> According to CSS3 Text, carriage returns are not white space characters.
>> They therefore do not get any special treatment during the white space
>> collapsing process and are treated the same as any other non-whitespace
>> control character.
>>
>> Both CSS3 Text (quoted above) and CSS2.1 (section 16.6.3) say that carriage
>> returns are treated as characters to render the same as normal characters:
>> they do not behave as control characters. I assume this means that if
>> there's a glyph in the font they are rendered as that glyph, otherwise some
>> substitution process is triggered just as for missing glyphs of other
>> characters. If that's not what we want for control characters, and what we
>> want is for the character to definitely disappear, or to definitely fall
>> back to nothing, then we'll need to adjust both specs to say so.
>
> I think that's not what we want. I'm suggesting that CR be treated as whitespace.

"treated as whitespace" is vague. Different kinds of whitespace are treated
differently.

>> The only thing I see missing in CSS3 Text is a statement that characters
>> designated as line breaks cause forced line breaks, which is pretty obvious,
>> but should be stated clearly somewhere. :)
>>
>> Is the behavior specced in CSS3 Text what you want, and would backporting
>> some changes to CSS2.1 to create the same effect solve the problem, or is
>> there something else you needed here?
>
> For white-space-collapse: collapse;, I need CR to collapse.

To collapse to what? Should it be treated as a space/tab or as a line break
during collapsing? (In Latin text both collapse to a space, but not in other
scripts.)
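
A rough sketch of why the answer matters, in Python. This is a deliberately
simplified model of collapsing, not the spec algorithm: the `cr_as` switch and
the `break_to` parameter are hypothetical, and `break_to=""` stands in for
scripts where a collapsed line break is dropped rather than turned into a space.

```python
import re

def collapse(text: str, cr_as: str, break_to: str = " ") -> str:
    """Simplified white space collapsing.

    cr_as    -- "space" or "break": which class CR is assumed to fall into.
    break_to -- what a collapsed line break becomes: " " for Latin-like
                text, "" where the break is assumed to be removed.
    """
    text = text.replace("\r", " " if cr_as == "space" else "\n")
    # Drop spaces and tabs around a line break; as a simplification,
    # a run of breaks is folded into one.
    text = re.sub(r"[ \t]*\n[ \t\n]*", "\n", text)
    text = text.replace("\n", break_to)
    # Remaining runs of spaces and tabs collapse to a single space.
    return re.sub(r"[ \t]+", " ", text)

print(repr(collapse("a\rb", "space")), repr(collapse("a\rb", "break")))
print(repr(collapse("日\r本", "space", "")), repr(collapse("日\r本", "break", "")))
```

With `break_to=" "` the two choices for CR give the same result; with
`break_to=""` they differ, which is the case the question is about.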

> For white-space-collapse: preserve-breaks;, I'm not totally confident what's
> best, but I've been persuaded that Opera 10.60's behavior (CR is a break but
> it coalesces with LF when appearing in a CRLF pair) is the thing I should
> be wanting.

So you want CRLF normalization to happen at the CSS level in addition to the
source markup level for text appearing in the DOM, but not for text in
generated content?
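
For reference, the preserve-breaks behavior described above (CR is a break,
but coalesces with LF in a CRLF pair) could be sketched like this in Python,
next to the alternative where CR and LF are each a break on their own. The
function names are made up, and this only models where the breaks fall:

```python
import re

def split_coalescing(text: str) -> list:
    # CRLF is tried first, so the pair produces a single break;
    # a lone CR or a lone LF is also a break.
    return re.split(r"\r\n|\r|\n", text)

def split_independent(text: str) -> list:
    # Every CR and every LF is its own break, so a CRLF pair
    # leaves an empty line between the two characters.
    return re.split(r"[\r\n]", text)

print(split_coalescing("a\r\nb\rc"))
print(split_independent("a\r\nb\rc"))
```

The first variant amounts to doing CRLF normalization at the CSS level.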

~fantasai

Received on Monday, 2 August 2010 20:24:00 UTC