
[whatwg] CR "entities" and LFCR

From: Kristof Zelechovski <giecrilj@stegny.2a.pl>
Date: Fri, 8 Jun 2007 08:24:09 +0200
Message-ID: <000201c7a995$9fa60f70$1a01080a@POCZTOWIEC>
Reading a file in text mode ignores all carriage return control characters.
Stray carriage returns are ignored as well.  
I do not think Macintosh text files should be allowed on the Web without
encoding.
Chris

-----Original Message-----
From: whatwg-bounces@lists.whatwg.org
[mailto:whatwg-bounces at lists.whatwg.org] On Behalf Of Michel Fortin
Sent: Friday, June 08, 2007 3:19 AM
To: WHATWG List
Subject: Re: [whatwg] CR "entities" and LFCR

On 2007-06-07 at 17:12, Michael A. Puls II wrote:

> Not sure if it'll help, but whenever I do newline normalization to  
> LF, I:
>
> Convert all CR + LF pairs to LF.
> Then, I convert any CRs left over to LF.
>
> Examples:
>
> LF + CR + LF + CR -> LF + LF + LF.
>
> CR + CR + LF -> LF + LF.

I think that's the standard way of doing it. Quoting Markdown source  
code, and some Perl code found on Wikipedia [1]:

     s/(\r\n|\n|\r)/\n/g

it does exactly that.

  [1]: http://en.wikipedia.org/wiki/Newline#Conversion_utilities
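
For illustration, here is a minimal Python sketch of the same normalization. It is not taken from Markdown or from Wikipedia, and the function names are hypothetical. The first function mirrors the single-pass substitution, where the CR+LF alternative is tried before a lone CR or LF; the second mirrors the two-step procedure quoted above.

```python
import re

# CR+LF is listed first so that a pair is consumed as one line break
# instead of being matched as a lone CR followed by a lone LF.
_NEWLINE = re.compile(r"\r\n|\n|\r")


def normalize_newlines(text: str) -> str:
    """Single pass: replace every CR+LF, LF, or CR with one LF."""
    return _NEWLINE.sub("\n", text)


def normalize_newlines_two_step(text: str) -> str:
    """Two steps: CR+LF pairs become LF, then leftover CRs become LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")
```

Both functions should agree on the two examples quoted above, and an LF+CR sequence gets no special treatment: it simply counts as two line breaks.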

Windows uses CR+LF, UNIX uses LF, legacy Mac applications still use  
CR; but I'm not aware of any system using LF+CR (and there is none on  
Wikipedia) and I don't think it's useful to give a meaning to it.


Michel Fortin
michel.fortin at michelf.com
http://www.michelf.com/
Received on Thursday, 7 June 2007 23:24:09 UTC
