Re: An HTML language specification vs. a browser specification

From: Boris Zbarsky <bzbarsky@MIT.EDU>
Date: Wed, 19 Nov 2008 21:36:14 -0500
Message-ID: <4924CD1E.8090108@mit.edu>
To: "Roy T. Fielding" <fielding@gbiv.com>
CC: HTML WG <public-html@w3.org>

Roy T. Fielding wrote:
> Likewise, this is a discussion about whether HTML should be defined
> using a declarative language specification or not.  I think it must
> be defined as a declarative language specification because that is
> what my tools need in order to understand and implement HTML.

I can respect that, though there are others who could probably make the 
opposite claim.  But see below.

> I am sure the wave of fanboys will start crying about my use
> of argument by authority, but quite frankly I don't care any more.
> Let them demonstrate by deploying implementations, not opinions.

I'm glad to know that you'd consider me a fanboy in this context and 
feel that I should demonstrate something by "deploying implementations". 
Let me get back to you on that once we finish shipping Gecko 1.9.1 
(which just happens to include HTML generation tools in addition to that 
little web browser part), ok?

>> But why is this a good thing?
> 
> Because they are different contexts.  The fact that I want my
> authoring tool to spellcheck my content does not imply that I want
> all browsers to display squiggles under every word not found in
> their own dictionaries.

At the same time, you want your browser and your authoring tool to agree 
on some things.  In particular, if you expect your authoring tool to 
import HTML someone else has created, it needs to be able to parse it in 
a reasonable way.  If you have no desire for such a feature, of course, 
you might not even need an HTML parser (e.g. your tool would store 
documents in some format of your choice, with richer annotation 
capabilities than HTML, and export HTML on demand).

> The HTML5 spec can't be understood without an implemented DOM.

This part I think is false.  The HTML5 parsing algorithm can't be 
understood without a mental model of containment of objects representing 
HTML elements by other objects representing HTML elements.  That is, you 
have to know where the open and close tags go.  Whether this mental 
model is a DOM, a stack of open/close pairs, or something else isn't 
actually that important, as far as I can tell.  You do have to be 
willing to dig arbitrarily far back in the representation (down the 
stack, up the tree in the DOM, whatever) in certain error-handling 
situations.
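
To make the "stack of open/close pairs" model concrete, here is a minimal 
sketch. It is not the HTML5 parsing algorithm, only an illustration of 
tracking containment without a DOM. The token format and the name 
`balance` are assumptions made up for this example. On a mismatched end 
tag it digs as far down the stack as needed, closing everything on the 
way, and it drops end tags that match nothing.

```python
def balance(tokens):
    """Turn a possibly mis-nested token list into a well-nested one.

    tokens: iterable of (kind, value) pairs, where kind is
    "open", "close", or "text" (a hypothetical format for this sketch).
    """
    stack = []  # names of currently open elements, innermost last
    out = []
    for kind, value in tokens:
        if kind == "open":
            stack.append(value)
            out.append((kind, value))
        elif kind == "close":
            if value not in stack:
                continue  # stray end tag with no open element: ignore it
            # Dig arbitrarily far down the stack, closing each element
            # we pass until we reach the one named by the end tag.
            while True:
                name = stack.pop()
                out.append(("close", name))
                if name == value:
                    break
        else:
            out.append((kind, value))
    # End of input: close whatever is still open, innermost first.
    while stack:
        out.append(("close", stack.pop()))
    return out


tokens = [("open", "div"), ("open", "b"), ("open", "i"),
          ("text", "x"), ("close", "div"), ("close", "i")]
for token in balance(tokens):
    print(token)
```

The same logic could be written against a tree by walking up parent 
links instead of down a list, which is the sense in which the choice of 
model does not matter much.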

Perhaps I'm biased in the sense that I do spend a lot of my time dealing 
with a DOM, but to me it doesn't seem very difficult to do a line-by-line 
translation into any other reasonable model.

That's assuming that you're talking about the parsing spec, of course. 
If you mean something else, I'd love it if you would elaborate here.

> Moreover, the specification of
> HTML should be in terms of a declarative language that is produced
> by generators and consumed by browsers

Hold on.  Are we talking about the specification of HTML parsing, or are 
we talking about the specification of what HTML documents are valid?  Or 
both?

> Finally, the parts of the spec that have nothing to do with HTML,
> such as SQL storage for web applications, should be kicked out.

I think there's near-universal agreement (including from Ian) that this 
would be a good idea (for SQL storage and a whole bunch of other stuff) 
as soon as someone actually steps up to edit those parts.

> The rationale for all of that is because HTML is a declarative
> language that has been designed to be portable across a very wide
> range of platforms and accessibility constraints

Sort of.  An interesting experiment is browsing the web with script 
disabled.  About 70% of it works sort of ok.  The other 30% is clearly 
not using declarative HTML...  I'm not saying that's a good thing, just 
a fact of life.

Note that those numbers are for me personally, and are weighted by use 
frequency, not number of pages (so if I visit my bank's website 3 times 
a week and view 7 other pages that week, and my bank requires JS to 
work, that gives a 70-30 split as above).  Your numbers may vary, of course.

-Boris

P.S.  I personally happen to agree with you on ping, by the way.
Received on Thursday, 20 November 2008 02:37:00 UTC
