
Re: An HTML language specification vs. a browser specification

From: Boris Zbarsky <bzbarsky@MIT.EDU>
Date: Fri, 14 Nov 2008 15:40:04 -0500
Message-ID: <491DE224.4070302@mit.edu>
To: Robert J Burns <rob@robburns.com>
CC: public-html@w3.org

Robert J Burns wrote:
> 1) a markup parsing and serialization specification  with thoroughly 
> specified error handling  that could apply as much to SGML (if DTD 
> support was added back into it) as it applies to HTML

So here's what I don't understand.  What do we mean by "parsing" in this 
context?  Typically a parser either constructs some data structure 
directly or provides a series of callbacks, right?

So would this specification specify what the callbacks are for this markup:

   <div>
     <table>
       <tr>
         <span>text</span>
       <tr>
     <table>
   </div>

?  I would hope so.  If it does, how is that different from the existing 
parsing specification?  Note that if I replace that <span> by <td> I 
would expect different behavior from the parser, so the parsing 
specification needs to be aware of specific elements of the language and 
how they behave while parsing (heck, that's true for HTML4, what with 
implied tags in some cases, etc).
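As a rough illustration of the callback view, the sketch below feeds the markup above to Python's standard-library `html.parser.HTMLParser`, which reports tokens as callbacks but does not build a tree or apply HTML error handling. The `EventRecorder` class and `events_for` helper are hypothetical names introduced here, and the markup is collapsed onto one line. The point is only that a parser with no knowledge of specific elements produces the same shape of event stream for <span> and <td>; any difference in behavior has to come from element-aware rules in the parsing specification.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Records each tokenizer callback as a tuple, in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only text between tags
            self.events.append(("text", data))


def events_for(markup):
    """Hypothetical helper: run the recorder over markup, return its events."""
    recorder = EventRecorder()
    recorder.feed(markup)
    recorder.close()
    return recorder.events


# The markup from the message, tags exactly as written there.
MARKUP = "<div><table><tr><span>text</span><tr><table></div>"

span_events = events_for(MARKUP)
td_events = events_for(MARKUP.replace("span", "td"))

# Same kinds of callbacks in the same order; only the tag name differs.
print([kind for kind, _ in span_events] == [kind for kind, _ in td_events])
```

This tokenizer-level stream says nothing about where the <span> ends up relative to the table, which is exactly the part an element-aware parsing specification has to define.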

> 2) modified HTML language and DOM specification

Of course the parsing specification depends on the former...  It's 
possible to describe the parser in SAX-like terms without talking about 
a DOM, I guess.  I'm not sure the added complexity of description is 
necessarily warranted, though: the sequence of callbacks that parsing 
HTML with error handling produces is more complex than SAX's.
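To make "more complex than SAX" concrete, here is a toy sketch, under loud assumptions, of an event stream that needs one extra kind of callback. It is loosely modeled on what the HTML parsing algorithm calls foster parenting, where misplaced content inside a table is inserted before the table rather than nested where it appeared. Every name here (`build_events`, `nearest_table_parent`, the `"foster"` event) is hypothetical, and the rule is far simpler than the real algorithm: it ignores implied elements such as tbody, attributes, and every other insertion mode.

```python
# Hypothetical names throughout; this is a toy, not the HTML parsing algorithm.
TABLE_CONTEXT = {"table", "tr"}                 # current nodes that expect table structure
TABLE_STRUCTURE = {"tbody", "tr", "td", "th"}   # start tags that fit that context


def nearest_table_parent(open_elements):
    """Return the open element just outside the nearest open table, or None."""
    for i in range(len(open_elements) - 1, 0, -1):
        if open_elements[i] == "table":
            return open_elements[i - 1]
    return None


def build_events(tokens):
    """Turn (kind, value) tokens into builder events.

    Assumed toy rule: a start tag that is not table structure, seen while the
    current node is a table or row, becomes a ("foster", tag, parent) event
    meaning "insert under parent, before the table", not a plain start.
    """
    open_elements = []
    events = []
    for kind, value in tokens:
        if kind == "start":
            current = open_elements[-1] if open_elements else None
            parent = nearest_table_parent(open_elements)
            if current in TABLE_CONTEXT and value not in TABLE_STRUCTURE and parent:
                events.append(("foster", value, parent))
            else:
                events.append(("start", value))
            open_elements.append(value)
        elif kind == "end":
            if value in open_elements:
                # Pop up to and including the matching open element.
                while open_elements.pop() != value:
                    pass
            events.append(("end", value))
        else:
            events.append(("text", value))
    return events


def tokens_with(cell):
    """Tokens for <div><table><tr><CELL>text</CELL>, with CELL as given."""
    return [("start", "div"), ("start", "table"), ("start", "tr"),
            ("start", cell), ("text", "text"), ("end", cell)]


foster_events = build_events(tokens_with("span"))
plain_events = build_events(tokens_with("td"))
```

With <td> the stream stays SAX-like (nested start and end events); with <span> the consumer has to understand an event that places a node somewhere other than the current nesting position, which plain SAX-style start/end callbacks cannot express.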

> 3) a web browser behavior specification (as Roy called it) including the 
> thorough specification of DOM method and attribute processing algorithms

Note that in practice parsing might need to depend on attribute values....

None of this even starts to touch the impact that script, if it's being 
executed, has on parsing, of course.  That would need to be covered too, 
and I'm not sure which of your three parts you envision handling that.

-Boris
Received on Friday, 14 November 2008 20:40:51 UTC
