Re: [css-text] I18N-ISSUE-337: line terminator handling

On 07/31/2014 03:27 PM, John C Klensin wrote:

> Two observations (not further complaints or justification for
> leaving this open unless others agree):
> (i) Unless there is general consensus that Unicode's attempt to
> introduce an unambiguous Line Separator in form of U+2028 has
> been a complete failure, I suppose the CSS document would be
> better off either including it as an additional alternative (to
> "... document language–defined segment break, CRLF sequence
> (U+000D U+000A), carriage return (U+000D), and line feed
> (U+000A)...") or mentioning why it is not so included.

White space handling and forced line break handling in CSS
are two very distinct operations. U+2028 is not white space
in the CSS sense: it is not affected by collapsing, and is
never discarded as such.

However, it is respected as a forced line break character if
it happens to occur in the document stream.

See, specifically, the first bullet point in section 5.1,
"Line Breaking Details".
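
To illustrate the distinction above, here is a deliberately simplified
sketch. The function names are hypothetical, and the collapsing step is
an assumption that reduces CSS white space processing to "runs of space,
tab, CR, and LF become one space"; the real rules are more nuanced. The
point is only that U+2028 survives collapsing and still forces a break.

```python
import re

LINE_SEPARATOR = "\u2028"

# Explicit character class on purpose: Python's \s also matches U+2028,
# which would wrongly treat it as collapsible white space.
_COLLAPSIBLE = re.compile(r"[ \t\r\n]+")


def collapse_white_space(text: str) -> str:
    """Simplified stand-in for collapsing under white-space: normal."""
    return _COLLAPSIBLE.sub(" ", text)


def forced_lines(text: str) -> list[str]:
    """Split at U+2028, which acts as a forced line break."""
    return text.split(LINE_SEPARATOR)


collapsed = collapse_white_space("a \n b\u2028c")
lines = forced_lines(collapsed)
```

Here the space and line feed between "a" and "b" collapse to a single
space, while the U+2028 is left in place and then splits the text into
two lines.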

> (ii) I believe that the Unicode Standard discussion of "NLF"
> represents a better approach than the indifference ("does not
> define...") expressed in the CSS spec.  I.e., one should be
> permissive in what is accepted but should canonicalize all of
> them to a single preferred form.  But that obviously isn't the
> way the spec is going.

Not sure why you think it does not canonicalize line breaks.
The spec says
   # When white-space is pre, pre-wrap, or pre-line, segment
   # breaks are not collapsible and are instead transformed
   # into a preserved line feed (U+000A).

The spec does not define what a segment break is because that
depends on the document language. In HTML it is an LF, CR, or
CRLF sequence; but in SGML other options are possible. (CSS is
defined to style a document tree; it doesn't care if the
document tree is HTML, DocBook XML, or some other format.)
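
As a rough sketch of the canonicalization quoted above, assuming the
HTML definition of a segment break (CRLF, CR, or LF): the helper name
and the set of values below are illustrative only, and the collapsing
behaviour for other white-space values is intentionally not modelled.

```python
import re

# Segment breaks as described for HTML: CRLF, CR, or LF.
# CRLF comes first so the pair counts as one break, not two.
_SEGMENT_BREAK = re.compile(r"\r\n|\r|\n")

# Values for which the quoted spec text says segment breaks are preserved.
PRESERVING_VALUES = {"pre", "pre-wrap", "pre-line"}


def preserve_segment_breaks(text: str, white_space: str) -> str:
    """Hypothetical helper: map each segment break to a single U+000A.

    For any other white-space value the text is returned unchanged,
    because collapsing is a separate step outside this sketch.
    """
    if white_space not in PRESERVING_VALUES:
        return text
    return _SEGMENT_BREAK.sub("\n", text)


normalized = preserve_segment_breaks("a\r\nb\rc\nd", "pre")
```

Note that U+2028 is not matched by the pattern, so it passes through
untouched, consistent with it not being a segment break here.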


Received on Saturday, 2 August 2014 07:04:01 UTC