- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Fri, 21 Nov 2008 12:08:35 +0200
- To: Rob Sayre <rsayre@mozilla.com>
- Cc: public-html@w3.org
On Nov 21, 2008, at 11:16, Rob Sayre wrote:

> Henri Sivonen wrote:
>> How would Mozilla work have benefited from the parsing algorithm
>> being in a different document?
>
> I was thinking of the frequent request for DOMParser to handle
> text/html. For this use, you probably don't want scripts executing,
> but you probably don't want <noscript> parsing either. This desired
> tree output is similar to server side uses I've observed.

Yeah, it would be useful to specify how the scripting state works with
DOMParser. Currently, the XHR2 spec says:

"If final MIME type is text/html let document be an object implementing
the Document interface that represents the response entity body parsed
following the rules set forth in the HTML specification for an HTML
parser with scripting disabled and then terminate this algorithm.
[HTML5]"
http://dev.w3.org/2006/webapi/XMLHttpRequest-2/

I would guess that if that behavior would be wrong for the use case of
DOMParser, it would be wrong for the use cases of XHR, too. I think the
XHR2 spec or a spec for DOMParser could say "let document be an object
implementing the Document interface that represents the response entity
body parsed following the rules set forth in the HTML specification for
an HTML parser with scripting enabled but without executing scripts and
then terminate this algorithm" without any HTML 5 spec refactoring.

> The Mozilla work would have benefited from a clear, complete, and
> finished document on HTML parsing and tokenization. I don't see why
> this document needs to be tied to a SQL API.
>
> Really, it's the publication schedules and revisions that are
> interesting, not the division of the document. I think this is
> obvious though, and I find all of the word games about "separate
> documents" to be quite counterproductive.

The WHATWG copy of the spec already has low-bureaucracy maturity
indicators for sections.
Managing different maturity levels of different parts of what is now a
monolithic spec in the W3C/IETF way adds bureaucracy and causes
artificial problems when seeking to do honest normative
cross-referencing.

I think that it's quite possible that the way the W3C and IETF manage
spec maturity levels is less practical for speccing the interoperable
browser platform than the way many countries manage their legal code
(constantly patching a big book with insanely complex cross-references,
including mutual referencing between various titles/acts).

The main problems with the W3C/IETF model are:

1) In order for a more mature spec to reference a mature section inside
   an otherwise less mature spec, the latter needs to be split, causing
   more bureaucracy.
2) Circular references lock specs into advancing together in maturity
   levels.

I'm not suggesting that all the specs that cover pieces of the
interoperable browser platform should be folded into one huge spec, but
I am inclined to think that the bureaucracy flowing from the maturity
rules related to normative cross references isn't productive and
doesn't necessarily serve its purpose. (If SVG references CSS2 instead
of CSS2.1, who benefits from readers having to have the tacit knowledge
that everyone is supposed to go read CSS 2.1 instead of CSS2?)

As for the maturity of the parsing section specifically, I think it
cannot mature more from where it is now before we get browser builds
with an implementation of the current draft to testers.

>>>> ... It's not horribly intertwined but there are some
>>>> dependencies ...
>>> I agree. That's why I don't think splitting parsing *and*
>>> vocabulary into a separate document is unreasonable on its face.
>>
>> I don't find it unreasonable on its face. (For MathML and SVG
>> elements, text/html parsing and the vocabulary are already in
>> separate documents.)
>> However, I think here we should allow the person who does the work
>> to use the spec organization that suits his work pattern, because
>> having the parsing and vocabulary in the same document isn't
>> unreasonable on its face, either.
>
> Isn't this whole thread an uproar about someone else doing some work?

The "language spec" uproar is partly about splitting away something
that *is* strongly connected with the parts that it left out.

> What if someone proposed taking some of these not horribly
> intertwined sections and putting them in a separate document (and
> doing the work)? Is that heretical?

No, it's not:
http://lists.w3.org/Archives/Public/public-html/2008Oct/0127.html

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Friday, 21 November 2008 10:09:17 UTC