- From: Jonas Sicking <jonas@sicking.cc>
- Date: Mon, 3 Oct 2011 17:14:27 -0700
- To: David Flanagan <dflanagan@mozilla.com>
- Cc: www-dom@w3.org
On Mon, Oct 3, 2011 at 5:04 PM, David Flanagan <dflanagan@mozilla.com> wrote:
> The HTML parsing algorithm has steps that require one to "set the Document
> to quirks-mode" or "set the Document to limited quirks-mode". The DOM
> doesn't define any API for doing that, but does define the
> Document.compatMode attribute which depends on those settings having been
> made. As far as I can tell, this means that it is not possible to implement
> a conforming HTML parser unless you're also implementing the DOM itself. I
> can't, for example, write an HTML parser in JavaScript that builds a tree
> using the native DOM provided by a browser, since I can't get the correct
> behavior for compatMode.
>
> Note that I cannot just expect document.implementation.createDocumentType()
> to do the right thing based on the doctype name, publicid and systemid. The
> parsing algorithm also sometimes forces a document into quirks mode based on
> syntax errors in the <!DOCTYPE> tag, but these syntax errors aren't visible
> once the tree-building stage of the parsing algorithm begins.
>
> I propose, therefore, that DOM4 add a 4th argument to createDocumentType().
> If true, then the document associated with that document type would be in
> quirks mode. If false, or omitted, then the document is in no-quirks mode
> or limited-quirks mode. Alternatively, and for more future flexibility,
> this 4th argument could be an optional string that becomes the value of
> compatMode.

I don't really like the idea of making it possible for pages to create
Documents that are in more "quirky" modes than absolutely needed.

Is anyone actually planning on implementing an HTML parser where they
don't control the DOM? I know that in the Gecko HTML parser we use a
lot of internal functions in order to improve performance.

/ Jonas
Received on Tuesday, 4 October 2011 00:15:24 UTC