Re: compatMode and the HTML parsing algorithm

On 10/3/11 5:31 PM, Ian Hickson wrote:
> On Mon, 3 Oct 2011, David Flanagan wrote:
>> I propose, therefore, that DOM4 add a 4th argument to
>> createDocumentType(). If true, then the document associated with that
>> document type would be in quirks mode.
> What's the use case?
>
> "Implementing an HTML parser in JavaScript in a browser" isn't a valid
> use case since browsers already expose HTML parsers, just like they
> expose DOMs. If you want to reimplement one of these, you need to
> reimplement the other, because it's all part of implementing a browser.
First, I have to retract my specific proposal.  Documents can be in 
quirks mode without having a doctype node, so adding a new argument to 
createDocumentType() wouldn't be sufficient.  There would have to be 
some sort of attribute or method on the Document itself.  Presumably 
making compatMode read/write is not the right thing to do. Maybe a 
method that allows the quirks mode to be set once, but never changed?
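
To make the "set once, but never changed" idea concrete, here is a minimal sketch in plain JavaScript. It models the behavior on a stand-in object rather than a real Document. The names SketchDocument and setQuirksModeOnce are hypothetical and appear in no spec or browser; only the "BackCompat" and "CSS1Compat" strings match what compatMode reports today.

```javascript
// Hypothetical sketch: SketchDocument and setQuirksModeOnce are illustrative
// names, not part of DOM4 or of any browser API.
class SketchDocument {
  #quirks = false;      // documents start out in no-quirks mode
  #modeLocked = false;  // becomes true after the first call to the setter

  // Read-only getter, using the same strings document.compatMode returns.
  get compatMode() {
    return this.#quirks ? "BackCompat" : "CSS1Compat";
  }

  // The mode may be set exactly once; any later call throws rather than
  // silently changing (or silently ignoring) the mode.
  setQuirksModeOnce(quirks) {
    if (this.#modeLocked) {
      throw new Error("quirks mode has already been set");
    }
    this.#quirks = Boolean(quirks);
    this.#modeLocked = true;
  }
}

// A script-level parser would call this once, at the point where it has
// decided whether the document is in quirks mode.
const doc = new SketchDocument();
doc.setQuirksModeOnce(true);
```

Throwing on a second call is just one possible design; ignoring later calls would serve a parser equally well.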

My argument for fixing this is mainly an esthetic one.  This is the only 
place in the HTML parsing algorithm that I am aware of where the parser 
manipulates the document in some way that cannot be done with the public 
API. If this is indeed the only such place, then that strikes me as a lack 
of coordination between the HTML and DOM specs, and I think some minor 
tweaks should be made to the two specs so that the parser can be 
specified purely in terms of the public document API. (Perhaps as I work 
my way through the algorithm in more detail I'll find a number of other 
such private document manipulations and that will ruin this argument.)

One use case for allowing HTML parsers in JS would be creating parser shims 
for old browsers that do not parse HTML correctly.  Or perhaps for 
writing utilities like 
http://software.hixie.ch/utilities/js/live-dom-viewer/ that compare a 
browser's parse tree with the parse tree generated by a compliant parser.
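
As a rough sketch of what such a comparison utility might do, the function below walks two trees in parallel and reports the path to the first place they disagree. The node shape (plain objects with name and children) and the function name firstDifference are assumptions made for illustration. A real tool would walk actual DOM nodes and would also compare attributes and text.

```javascript
// Hypothetical sketch: nodes are plain { name, children } objects, an assumed
// shape standing in for real DOM nodes.
function firstDifference(a, b, path = a.name) {
  // Different node names at the same position count as a mismatch.
  if (a.name !== b.name) return path;

  const count = Math.max(a.children.length, b.children.length);
  for (let i = 0; i < count; i++) {
    const ca = a.children[i];
    const cb = b.children[i];
    const childPath = `${path}/${(ca || cb).name}[${i}]`;
    if (!ca || !cb) return childPath; // one tree has a child the other lacks
    const diff = firstDifference(ca, cb, childPath);
    if (diff !== null) return diff;
  }
  return null; // same names and same shape throughout
}

// Example: the second tree has an extra <table> child under <body>.
const browserTree = { name: "html", children: [
  { name: "head", children: [] },
  { name: "body", children: [{ name: "p", children: [] }] },
] };
const shimTree = { name: "html", children: [
  { name: "head", children: [] },
  { name: "body", children: [
    { name: "p", children: [] },
    { name: "table", children: [] },
  ] },
] };
console.log(firstDifference(browserTree, shimTree)); // "html/body[1]/table[1]"
```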

     David

Received on Tuesday, 4 October 2011 17:30:20 UTC