[whatwg] Space characters from Elliotte Harold on 2006-11-06 (public-whatwg-archive@w3.org from November 2006)

From: Elliotte Harold <elharo@metalab.unc.edu>
Date: Mon, 06 Nov 2006 08:31:00 -0500
Message-ID: <454F3914.7040103@metalab.unc.edu>

Henri Sivonen wrote:

> Would there be serious compatibility problems if the HTML5 parsing 
> algorithm required VT and FF to be mapped to space (after expanding 
> NCRs) and the higher-level parts of the spec defined white space as 
> space, tab, CR and LF?
> 

That seems a reasonable solution to me. I doubt anyone these days is 
heavily depending on VT and FF. Mostly it's just random leftover 
detritus from very old text files.
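
The mapping Henri proposes might be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names (`expand_ncrs`, `normalize_vt_ff`, `HTML_WHITE_SPACE`) are hypothetical, the NCR handling is deliberately simplified, and none of it is taken from the HTML5 parsing algorithm itself.

```python
import re

# Hypothetical names for illustration; not from any spec or library.
_NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

# VT (U+000B) and FF (U+000C) both map to a plain space.
_VT_FF_TO_SPACE = {0x0B: " ", 0x0C: " "}

# White space as the higher-level parts of the spec would then see it.
HTML_WHITE_SPACE = " \t\r\n"


def expand_ncrs(text):
    """Expand decimal and hex numeric character references (simplified)."""
    def repl(match):
        hex_digits, dec_digits = match.group(1), match.group(2)
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Leave out-of-range references untouched in this sketch.
        return chr(code) if code <= 0x10FFFF else match.group(0)
    return _NCR.sub(repl, text)


def normalize_vt_ff(text):
    """Expand NCRs first, then map VT and FF to space."""
    return expand_ncrs(text).translate(_VT_FF_TO_SPACE)
```

Expanding the references before mapping matters: otherwise `&#12;` would reintroduce an FF after the mapping step had already run.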

It's important to note that if you instead let VT and FF through unmapped 
in the HTML serialization, then:

1. The document has no infoset.
2. The document cannot be serialized as well-formed XHTML.
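
Both points follow from the XML 1.0 `Char` production, which excludes U+000B and U+000C, so a document containing them is not well-formed and hence has no infoset. A minimal sketch of the check, assuming XML 1.0 and using Python's standard `xml.etree.ElementTree` parser; the helper names `is_xml10_char` and `parses_as_xml` are hypothetical:

```python
import xml.etree.ElementTree as ET


def is_xml10_char(ch):
    """True if ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def parses_as_xml(doc):
    """True if the standard-library parser accepts doc as well-formed."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Neither the literal character nor a character reference gets through.
print(is_xml10_char("\x0b"), is_xml10_char("\x0c"))
print(parses_as_xml("<p>a\x0bb</p>"), parses_as_xml("<p>a&#12;b</p>"))
```

Escaping the character as `&#11;` or `&#12;` does not help in XML 1.0, since a character reference must also refer to a character matching `Char`.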

Is it a requirement of the spec that every HTML 5 DOM be fully 
serializable as XHTML as well as HTML 5? If not, why not?

-- 
Elliotte Rusty Harold  elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/

Received on Monday, 6 November 2006 05:31:00 UTC