- From: Simon Pieters <simonp@opera.com>
- Date: Tue, 31 Jul 2007 03:04:26 +0200
- To: public-html <public-html@w3.org>
(This is part of my detailed review of the parsing algorithm.)
In http://www.whatwg.org/specs/web-apps/current-work/#consume the spec
states that &#13; is a parse error. Is this intentional?
The handling of &#13;, &#10;, CRs and LFs, and their combinations, seems
to differ a bit between browsers.
http://simon.html5.org/test/html/parsing/tokenisation/entities/carriage-return/demo.htm
In Opera, CRs and LFs are preserved in the DOM as they were written. CR is
inserted for &#13; and LF for &#10;. A CRLF pair in the DOM is rendered as
a single linebreak.
In IE, CRLF pairs are converted to a single CR, and the remaining LFs are
converted to CRs. It doesn't matter whether they came from real characters
in the input stream or from NCRs.
In Safari, an LF character in the input stream is ignored if the previous
character was a CR (whether real or NCR). CRs (both real and NCRs) are
then converted to LFs. LFs are inserted for both &#13; and &#10;.
In Firefox, CRLF pairs in the input stream are converted to LF, and any
remaining CRs are converted to LFs. LFs are inserted for both &#13; and &#10;.
The spec currently matches Firefox, AFAICT. Rendering-wise, there is
interop between IE and Opera. I think the spec should require what IE
does, except use LFs instead of CRs.
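
As a rough illustration, the behaviours described above can be modelled in
Python. This is only a sketch of the descriptions in this message, not code
taken from any browser or from the spec: the function names are hypothetical,
and it assumes that only the decimal forms &#13; and &#10; appear in the
source text.

```python
import re

CR, LF = "\r", "\n"


def expand_ncrs(src: str) -> str:
    """Replace the NCRs &#13; and &#10; with the characters they name."""
    return src.replace("&#13;", CR).replace("&#10;", LF)


def opera(src: str) -> str:
    # CRs and LFs are preserved as written; NCRs map to CR and LF.
    return expand_ncrs(src)


def ie(src: str) -> str:
    # Real characters and NCRs are treated alike: CRLF -> CR, then LF -> CR.
    return expand_ncrs(src).replace(CR + LF, CR).replace(LF, CR)


def safari(src: str) -> str:
    # A real LF is dropped when it follows a CR (real or NCR) ...
    s = re.sub(r"(\r|&#13;)\n", r"\1", src)
    # ... then every CR, real or NCR, becomes LF.
    return expand_ncrs(s).replace(CR, LF)


def firefox(src: str) -> str:
    # Only real characters are normalised as pairs: CRLF -> LF, then CR -> LF.
    s = src.replace(CR + LF, LF).replace(CR, LF)
    # Both NCRs then insert an LF.
    return s.replace("&#13;", LF).replace("&#10;", LF)


def proposed(src: str) -> str:
    # What IE does, except with LFs instead of CRs.
    return expand_ncrs(src).replace(CR + LF, LF).replace(CR, LF)


if __name__ == "__main__":
    sample = "a\r\nb&#13;&#10;c&#13;d"
    for model in (opera, ie, safari, firefox, proposed):
        print(f"{model.__name__:9} {model(sample)!r}")
```

In this model the difference between Firefox and the proposal shows up for
an NCR pair such as &#13;&#10;: firefox() yields two LFs, while proposed()
yields one.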
--
Simon Pieters
Opera Software
Received on Tuesday, 31 July 2007 01:04:42 UTC