
[whatwg] CR "entities" and LFCR

From: Michael A. Puls II <shadow2531@gmail.com>
Date: Thu, 7 Jun 2007 17:12:38 -0400
Message-ID: <6b9c91b20706071412y79e04dfavcaab018535116ba7@mail.gmail.com>
On 6/7/07, Anne van Kesteren <annevk at opera.com> wrote:
> These should be converted to LF too. One thing that might be interesting
> to look into is the handling of LFCR in browsers (as opposed to CRLF). I
> haven't done that yet... Some browsers (just tested Opera) also normalize
> two newline entities following each other (CRLF pair).

Not sure if it'll help, but whenever I do newline normalization to LF, I:

Convert all CR + LF pairs to LF.
Then, I convert any CRs left over to LF.

Examples:

LF + CR + LF + CR -> LF + LF + LF.

CR + CR + LF -> LF + LF.
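
The two steps above can be sketched in a few lines of Python. This is only an illustration of the order of the two passes; the function name normalize_newlines is made up for this example and is not taken from any browser or spec.

```python
def normalize_newlines(text: str) -> str:
    """Normalize newlines to LF in two passes (hypothetical helper)."""
    # Pass 1: collapse every CR + LF pair into a single LF.
    text = text.replace("\r\n", "\n")
    # Pass 2: any CR still left over stands alone, so turn it into LF.
    return text.replace("\r", "\n")


# The two examples from above, written with escape sequences.
print(repr(normalize_newlines("\n\r\n\r")))  # LF + CR + LF + CR
print(repr(normalize_newlines("\r\r\n")))    # CR + CR + LF
```

The order matters: doing the lone-CR pass first would turn each CR + LF pair into LF + LF and double the line breaks.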

Anyway,

In the case of:

<!DOCTYPE html><html><head><title></title></head><body><div>1&#10;&#13;2</div></body></html>

Opera produces LF + CR in the DOM for the nodeValue of the div's text node.

Firefox produces LF + LF (What I'd expect.)

IE6 produces a space. (If the div consists of only those 2 entities,
without the 1 and the 2, IE6 throws the newlines away and there will
be no childNodes for the div.)

FF's way seems right IMO.
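
To make the comparison concrete, here is a rough Python sketch that applies the two-pass rule (CR + LF pairs first, then leftover CRs) to the div's text and checks each browser's reported result against it. The strings are just the values reported above written with escape sequences; the names normalize_newlines, observed and matches are invented for this example.

```python
def normalize_newlines(text: str) -> str:
    # CR + LF pairs become LF first, then any leftover CR becomes LF.
    return text.replace("\r\n", "\n").replace("\r", "\n")


# Text of the div after decoding: "1", LF (&#10;), CR (&#13;), "2".
decoded = "1\n\r2"
expected = normalize_newlines(decoded)

# Results as reported in this message, not re-measured here.
observed = {
    "Opera": "1\n\r2",    # LF + CR left as-is
    "Firefox": "1\n\n2",  # LF + LF
    "IE6": "1 2",         # a single space
}

matches = {browser: value == expected for browser, value in observed.items()}
for browser, ok in matches.items():
    print(browser, "matches" if ok else "differs")
```

Only the Firefox value comes out equal to the two-pass result.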

-- 
Michael
Received on Thursday, 7 June 2007 14:12:38 UTC
