- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Fri, 14 Nov 2008 15:40:04 -0500
- To: Robert J Burns <rob@robburns.com>
- CC: public-html@w3.org
Robert J Burns wrote:
> 1) a markup parsing and serialization specification — with thoroughly
> specified error handling — that could apply as much to SGML (if DTD
> support was added back into it) as it applies to HTML

So here's what I don't understand. What do we mean by "parsing" in this context? Typically a parser either constructs some data structure directly or provides a series of callbacks, right? So would this specification specify what the callbacks are for this markup:

  <div>
    <table>
      <tr>
        <span>text</span>
      <tr>
    <table>
  </div>

? I would hope so. If it does, how is that different from the existing parsing specification?

Note that if I replace that <span> by <td> I would expect different behavior from the parser, so the parsing specification needs to be aware of specific elements of the language and how they behave while parsing (heck, that's true for HTML4, what with implied tags in some cases, etc).

> 2) modified HTML language and DOM specification

Of course the parsing specification depends on the former...

It's possible to describe the parser in SAX-like terms without talking about a DOM, I guess. I'm not sure the added complexity of description is necessarily warranted, though: the sequence of callbacks that parsing HTML with error handling produces is more complex than SAX.

> 3) a web browser behavior specification (as Roy called it) including the
> thorough specification of DOM method and attribute processing algorithms

Note that in practice parsing might need to depend on attribute values....

None of this even starts to touch the impact that script, if it's being executed, has on parsing, of course. That would need to be covered too, and I'm not sure which of your three parts you envision handling that.

-Boris
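
To make the callback question above concrete, here is a minimal sketch, assuming Python's standard `html.parser` module. That module is only a tokenizer-level callback API: it does not build a tree and does not perform HTML error recovery, so it is not the parser the message is asking about. The names `CallbackLogger`, `callbacks_for`, and `SAMPLE` are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser


class CallbackLogger(HTMLParser):
    """Hypothetical helper: records each callback the tokenizer fires."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Skip whitespace-only text between tags to keep the log short.
        if data.strip():
            self.events.append(("text", data.strip()))


def callbacks_for(markup):
    """Feed markup to a fresh logger and return the recorded events."""
    logger = CallbackLogger()
    logger.feed(markup)
    logger.close()
    return logger.events


# The markup from the message, with its unclosed <tr> and <table> kept as-is.
SAMPLE = "<div> <table> <tr> <span>text</span> <tr> <table> </div>"

span_events = callbacks_for(SAMPLE)
td_events = callbacks_for(SAMPLE.replace("span", "td"))

for event in span_events:
    print(event)
```

A tokenizer like this reports the same shape of callback sequence whether the inner element is `<span>` or `<td>`. Any difference in behavior between the two cases has to come from a layer that knows how specific elements behave during parsing, which is the dependency the message points out.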
Received on Friday, 14 November 2008 20:40:51 UTC