compatMode and the HTML parsing algorithm from David Flanagan on 2011-10-04 (www-dom@w3.org from October to December 2011)

From: David Flanagan <dflanagan@mozilla.com>
Date: Mon, 03 Oct 2011 17:04:10 -0700
To: www-dom@w3.org
Message-ID: <4E8A4D7A.9070508@mozilla.com>

The HTML parsing algorithm has steps that require one to "set the 
Document to quirks-mode" or "set the Document to limited quirks-mode". 
The DOM doesn't define any API for doing that, but does define the 
Document.compatMode attribute which depends on those settings having 
been made.  As far as I can tell, this means that it is not possible to 
implement a conforming HTML parser unless you're also implementing the 
DOM itself.  I can't, for example, write an HTML parser in JavaScript 
that builds a tree using the native DOM provided by a browser, since I 
can't get the correct behavior for compatMode.

Note that I cannot just expect 
document.implementation.createDocumentType() to do the right thing based 
on the doctype name, publicId, and systemId.  The parsing algorithm also 
sometimes forces a document into quirks mode based on syntax errors in 
the <!DOCTYPE> tag, but these syntax errors aren't visible once the 
tree-building stage of the parsing algorithm begins.
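
As a rough illustration of the point above, the following TypeScript 
sketch models a DOCTYPE token together with the force-quirks flag that 
the HTML tokenizer sets on malformed doctypes.  The DoctypeToken type 
and the createDocumentTypeArgs helper are hypothetical names invented 
for this example; only the three-argument shape of createDocumentType() 
comes from the DOM.  Two tokens that differ only in the flag produce 
identical arguments, so the flag cannot reach the document.

```typescript
// Hypothetical model of a DOCTYPE token as emitted by an HTML tokenizer.
// The field names are assumptions made for this sketch.
interface DoctypeToken {
  name: string;
  publicId: string;
  systemId: string;
  // Set by the tokenizer when the <!DOCTYPE> syntax is malformed.
  forceQuirks: boolean;
}

// The only data a script can pass to
// document.implementation.createDocumentType(name, publicId, systemId).
function createDocumentTypeArgs(token: DoctypeToken): [string, string, string] {
  return [token.name, token.publicId, token.systemId];
}

const wellFormed: DoctypeToken = {
  name: "html",
  publicId: "",
  systemId: "",
  forceQuirks: false,
};
const malformed: DoctypeToken = { ...wellFormed, forceQuirks: true };

// Compare the argument lists derived from the two tokens.
const sameArgs =
  JSON.stringify(createDocumentTypeArgs(wellFormed)) ===
  JSON.stringify(createDocumentTypeArgs(malformed));
console.log(sameArgs);
```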

I propose, therefore, that DOM4 add a fourth argument to 
createDocumentType().  If true, then the document associated with that 
document type would be in quirks mode.  If false or omitted, then the 
document would be in no-quirks mode or limited-quirks mode.  
Alternatively, and for more future flexibility, this fourth argument 
could be an optional string that becomes the value of compatMode.
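
A minimal sketch of how the boolean form of this proposal might behave, 
written in TypeScript as a standalone model.  Nothing here is an 
existing DOM API: createDocumentTypeProposed, ModelDocumentType and 
compatModeFor are hypothetical names, and storing the flag on the 
doctype object is an assumption made to keep the example small.  The 
two compatMode strings are the values the DOM already defines.

```typescript
// Hypothetical model of the proposal; none of this is an existing DOM API.
type CompatMode = "BackCompat" | "CSS1Compat";

interface ModelDocumentType {
  name: string;
  publicId: string;
  systemId: string;
  // The proposed fourth argument, recorded here for the sketch.
  quirks: boolean;
}

// Stand-in for the proposed
// createDocumentType(name, publicId, systemId, quirks).
// Omitting the fourth argument is treated the same as passing false.
function createDocumentTypeProposed(
  name: string,
  publicId: string,
  systemId: string,
  quirks: boolean = false
): ModelDocumentType {
  return { name, publicId, systemId, quirks };
}

// compatMode reports "BackCompat" in quirks mode and "CSS1Compat"
// in both no-quirks and limited-quirks mode.
function compatModeFor(doctype: ModelDocumentType): CompatMode {
  return doctype.quirks ? "BackCompat" : "CSS1Compat";
}

const forced = compatModeFor(createDocumentTypeProposed("html", "", "", true));
const normal = compatModeFor(createDocumentTypeProposed("html", "", ""));
console.log(forced, normal);
```

With this shape, a script-based parser could pass the tokenizer's 
force-quirks decision straight through as the fourth argument.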

     David

Received on Tuesday, 4 October 2011 00:04:39 UTC