
Re: Whitespace

From: L. David Baron <dbaron@dbaron.org>
Date: Tue, 10 Jun 2008 14:03:00 -0700
To: Ian Hickson <ian@hixie.ch>, "Linss, Peter" <peter.linss@hp.com>
Cc: www-style@w3.org, Alex Mogilevsky <alexmog@exchange.microsoft.com>, Justin Rogers <justrog@microsoft.com>
Message-ID: <20080610210300.GA27649@pickering.dbaron.org>

On Tuesday 2008-06-10 18:54 +0000, Ian Hickson wrote:
> For consistency in the Web platform I would like us to make the whitespace 
> definitions for HTML5 and CSS match. Right now, HTML5 defines the 
> following characters to be syntactic whitespace:
> 
>    U+0020 SPACE, U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED (LF), 
>    U+000B LINE TABULATION, U+000C FORM FEED (FF), and U+000D CARRIAGE 
>    RETURN (CR)
>    http://www.whatwg.org/specs/web-apps/current-work/#space
> 
> CSS2.1 defines the following characters to be syntactic whitespace:
> 
>    "space" (U+0020), "tab" (U+0009), "line feed" (U+000A), "carriage 
>    return" (U+000D), and "form feed" (U+000C) 
> 
> The only difference appears to be the inclusion of U+000B in the 
> definition for HTML5.

So, I was going to propose a change yesterday, but not a change this
big.  I was just going to propose changing the definition of
whitespace for ~= selectors (and class selectors) to match HTML5,
since those selectors are intended to match HTML.
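
To illustrate what is at stake for ~= matching, here is a minimal Python
sketch, assuming the two character sets exactly as quoted above. The names
HTML5_WS, CSS21_WS, split_tokens and matches_includes are hypothetical and
do not come from either specification or from any browser.

```python
# Whitespace sets as quoted earlier in this message (names are illustrative).
HTML5_WS = frozenset("\u0020\u0009\u000A\u000B\u000C\u000D")
CSS21_WS = frozenset("\u0020\u0009\u000A\u000D\u000C")


def split_tokens(value, whitespace):
    """Split an attribute value into tokens on the given whitespace set."""
    tokens, current = [], []
    for ch in value:
        if ch in whitespace:
            # A whitespace character ends the current token, if there is one.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def matches_includes(value, word, whitespace):
    """Sketch of [attr~=word]: true if word is one of the tokens of value."""
    return word in split_tokens(value, whitespace)


# The only character on which the two sets disagree is U+000B.
difference = HTML5_WS - CSS21_WS

# A class attribute containing U+000B between two names.
value = "a\u000Bb"
html5_tokens = split_tokens(value, HTML5_WS)  # U+000B separates the names
css21_tokens = split_tokens(value, CSS21_WS)  # U+000B is part of one token
```

Under these assumptions, a selector such as [class~=a] would match the
value above only if U+000B is treated as whitespace.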

But now I've reconsidered.  There's a *lot* of data in:
  https://bugzilla.mozilla.org/show_bug.cgi?id=437915

I'm strongly opposed to changing the CSS definition of whitespace
that's been stable for ten years and is reliably implemented across
browsers.  See Gecko and WebKit behavior on:
https://bugzilla.mozilla.org/attachment.cgi?id=324389
https://bugzilla.mozilla.org/attachment.cgi?id=324515

> HTML5's definition has a couple of minor advantages: it seems to be 
> closer to what IE7 does (at least for HTML), and it allows spaces to be 
> defined as the range of characters from U+0009 to U+000D plus U+0020, 
> rather than having it be five separate codepoints, which may allow for 
> some subtle optimisations.
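
The "range versus five codepoints" point can be sketched in Python as
follows. This is only an illustration of the two definitions as quoted;
the function names are hypothetical, and nothing here is taken from a real
parser.

```python
def is_html5_space(ch):
    """HTML5 definition as quoted: U+0009 through U+000D, plus U+0020."""
    cp = ord(ch)
    return 0x0009 <= cp <= 0x000D or cp == 0x0020


def is_css21_space(ch):
    """CSS2.1 definition as quoted: five separate code points."""
    return ch in "\u0020\u0009\u000A\u000D\u000C"


# Code points below U+0080 on which the two definitions disagree.
disagreements = [
    cp for cp in range(0x80)
    if is_html5_space(chr(cp)) != is_css21_space(chr(cp))
]
```

The range form needs one interval test plus one equality test, while the
CSS2.1 form has to skip U+000B in the middle of that interval.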

IE7's behavior is so wacky that it's nearly impossible to tell what
it does in CSS, since its CSS parser recovers from errors very
aggressively.

> Would adding U+000B to the CSS white space definition be acceptable to the 
> CSSWG, or are there good reasons to exclude U+000B that should cause me to 
> remove it from the HTML5 definition?

I think it should just be removed from the HTML5 definition.

On Tuesday 2008-06-10 20:30 +0000, Linss, Peter wrote:
> FWIW Gecko accepts U+000B as whitespace (and likely has since the
> beginning).

No it doesn't.  See results on:
https://bugzilla.mozilla.org/attachment.cgi?id=324389
https://bugzilla.mozilla.org/attachment.cgi?id=324515

-David

-- 
L. David Baron                                 http://dbaron.org/
Mozilla Corporation                       http://www.mozilla.com/
Received on Tuesday, 10 June 2008 21:03:54 GMT
