
Re: CR and LF in the input stream / as NCRs (detailed review of parsing algorithm)

From: Simon Pieters <simonp@opera.com>
Date: Tue, 31 Jul 2007 13:28:17 +0200
To: "Michael A. Puls II" <shadow2531@gmail.com>
Cc: public-html <public-html@w3.org>
Message-ID: <op.twbt5ft3idj3kv@hp-a0a83fcd39d2>

On Tue, 31 Jul 2007 04:22:26 +0200, Michael A. Puls II  
<shadow2531@gmail.com> wrote:

> On 7/30/07, Simon Pieters <simonp@opera.com> wrote:
>>
>> (This is part of my detailed review of the parsing algorithm.)
>>
>> In http://www.whatwg.org/specs/web-apps/current-work/#consume the spec
>> states that &#13; is a parse error. Is this intentional?
>>
>>
>> The handling of &#10;, &#13;, CRs and LFs, and their combinations, seems
>> to be a bit different in browsers.
>>
>>
>> http://simon.html5.org/test/html/parsing/tokenisation/entities/carriage-return/demo.htm
>
> To add, see http://shadow2531.com/opera/testcases/cdata/002.html
>
> In that situation, what Firefox and Safari give is correct IMO. IE and
> Opera are way off.

Aha. I didn't think of testing attributes.

    http://simon.html5.org/test/html/parsing/tokenisation/entities/carriage-return/demo-attribute.htm
    http://simon.html5.org/test/html/parsing/tokenisation/entities/carriage-return/demo-attribute2.htm

Safari preserves CRs in attribute values, both real and NCRs. CRLF pairs,  
LFCR pairs, CRs and LFs cause a single linebreak in the tooltip. In data,  
CRs don't cause linebreaks.

For title="", IE preserves CRs in attribute values, both real and NCRs.  
CRLF pairs, CRs and LFs in the DOM get rendered as a single linebreak in  
the tooltip. For value="", all types of linebreaks are converted to CRLF  
pairs. In data, only CRs cause linebreaks and LFs are rendered as spaces.

Firefox preserves CRs in attribute values, both real and NCRs. CRs are  
ignored and LFs are rendered as spaces in the tooltip. In data, CRs don't  
cause linebreaks.

For title="", Opera drops LFs in attribute values, both real and NCRs, and  
converts CRs (both real and NCRs) to spaces. For value="", CRs and LFs are  
preserved as written, both real and NCRs.
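
To make the title="" observations above easier to compare, here is a rough  
Python sketch that models each one as a string transformation. The function  
names are made up for illustration, and the code only restates the behaviour  
described in this message; it is not derived from any browser's source and  
does not cover value="" or data.

```python
import re

def safari_tooltip(value: str) -> str:
    # CRLF pairs, LFCR pairs, lone CRs and lone LFs each give one linebreak.
    return re.sub(r"\r\n|\n\r|\r|\n", "\n", value)

def ie_title_tooltip(value: str) -> str:
    # CRLF pairs, lone CRs and lone LFs each give one linebreak.
    # LFCR pairs were not described for IE, so this model (an assumption)
    # treats them as an LF followed by a CR, i.e. two linebreaks.
    return re.sub(r"\r\n|\r|\n", "\n", value)

def firefox_tooltip(value: str) -> str:
    # CRs are ignored; LFs are rendered as spaces.
    return value.replace("\r", "").replace("\n", " ")

def opera_title_value(value: str) -> str:
    # LFs are dropped; CRs are converted to spaces.
    return value.replace("\n", "").replace("\r", " ")
```

Feeding the same string, for example "a\r\nb", to all four functions shows  
how far the models diverge for a single CRLF pair.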


Personally, I think attribute values should be parsed the same way as data  
is parsed wrt linebreaks.
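
One possible reading of that suggestion, sketched in Python: assume "the  
same way as data" means the usual input-stream newline normalization, where  
a CRLF pair or a lone CR becomes a single LF, and apply it to attribute  
values too. The function name is hypothetical, and the sketch covers only  
literal CRs and LFs; what to do with &#13; and &#10; is the open question  
here and is deliberately not modelled.

```python
import re

def normalize_newlines(text: str) -> str:
    # Assumption: a CRLF pair or a lone CR becomes one LF; LFs are kept.
    return re.sub(r"\r\n?", "\n", text)

# Same function for both contexts, so data and attribute values agree.
data = normalize_newlines("line one\r\nline two\rline three")
title = normalize_newlines("tip one\r\ntip two\rtip three")
```

With a single function used for both, a literal linebreak would end up as  
the same character in the DOM regardless of where it appeared in the markup.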

-- 
Simon Pieters
Opera Software
Received on Tuesday, 31 July 2007 11:28:57 UTC
