
Re: An HTML language specification vs. a browser specification

From: Roy T. Fielding <fielding@gbiv.com>
Date: Mon, 17 Nov 2008 19:11:45 -0800
Message-Id: <52CA5E79-3886-4C56-8FD3-49B81615E0E4@gbiv.com>
Cc: HTML WG <public-html@w3.org>
To: Ian Hickson <ian@hixie.ch>

On Nov 17, 2008, at 4:45 PM, Ian Hickson wrote:
> Your e-mail seemed to imply a set of fundamental assumptions that I am
> not sure we share. In order to help get a better common understanding,
> I'd like to see if you can explain to me whether I am correct that you
> have those assumptions and if so, why you hold them to be true.
>
> 1. Browsers, in particular HTML parsers in browsers, change with
>    regularity in ways that change the rendering of existing pages.

I made no such claim.  I said that "Most content is written
programmatically or for tools that existed in the distant past
(none of my content, for example, has ever been written by testing
what works in current browsers even back in the days when current
actually meant something)."

That was in direct contradiction to Jonas' claim that "current HTML
has been written for browsers and by testing what works in current
browsers."

> 2. The vast majority of "non-program" content was written long before
>    MSIE6 existed (2001).
>
> My understanding is that content on the Web has been growing
> exponentially year over year, which makes this assumption seem
> implausible. Could you provide us with data that backs up your
> assertion?
>
> (Data that backs up the opposite assertion would be, for example, that
> search engines around 2000 knew of about a billion pages, whereas
> search engines today know of about a trillion pages, suggesting that
> the majority of content is newer than 2000. [1])
>
> [1] http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html

Written programmatically includes everything rendered by blogging
software, content management systems, Google, Yahoo, Facebook,
YouTube, Word's "save as HTML", and regurgitated by sites that
transclude other sites.

The billion or so pages of information growth is almost entirely
placed into HTML form by authoring tools that do not give authors
control over the HTML form.  Those tools are developed based on
language specifications and generic templates, not by "designing
for current browsers" which didn't exist at the time and not by
"testing what works in current browsers" that have a lifetime far
shorter than the information produced by authoring tools.

> 3. Authors when writing Web pages do not attempt to make their pages
>    look like they want in the browser they use.
>
> Based on the feedback one sees in authoring community discussion
> groups, it appears that authors do in fact check that their new
> content renders as they desire in contemporary browsers. If you
> disagree, could you demonstrate why you believe this is not the case?

Of course they do.  They also think that adding an entry using
Wordpress is authoring HTML.  What's your point?

> 4. A browser that doesn't implement the APIs, vocabulary, and error
>    handling that major browsers implement could effectively compete in
>    the marketspace.
>
> Could you provide an example of a competitive browser that doesn't
> implement, as you put it, "all that crap"?

No.  The Web would be better off without that crap, but I have
no objection to you putting all that crap into a browser spec
if you think all browsers need to implement it.  That is in contrast
to HTML, the language, which is something that my software does
generate and needs to remain compliant with, and thus it does
cause a great deal of harm for you to add a bunch of procedural
nonsense to the declarative language definition.

> 5. A specification that defines how to implement a Web browser would
>    remove competition in the browser space.
>
> Reports from browser vendors suggest that a considerable amount of
> time is spent reverse-engineering other browsers in order to be
> competitive. HTML5 attempts to reduce this by doing all this work for
> them, thus reducing the amount of work that it would take to make a
> competitive browser.
>
> Why do you think that defining these features in detail reduces the
> ability for new competitors to enter the market?

Because defining error behavior as the standard makes it very
difficult for applications that are error-free to be approved
for use within the environments that require adherence to standards
(including the stupid ones).

> 6. Most people don't want a specification that covers the features
>    that HTML5 covers.
>
> I understand that you might not want it, but what evidence do you have
> that the majority of the Web standards community doesn't want it?

Because not a single expert in the Web standards community that
I have talked to in the past two years has supported the current
work in HTML5.  The single most common reaction to the features
that you have wedged into HTML5 is abject laughter and disdain
for this process.

> (There is counter-evidence, for example the size of the HTML and
> WHATWG working groups and the level of support that HTML5 has had in
> working group votes.)

You aren't listening to the objections.  Nothing I have said here
hasn't been repeated several times already and reinforced by a
dozen others, and yet you have not made a single change to the
document that represents those WG opinions.

> 7. Only browsers need to deal with error handling in parsing.
>
> Why do you think that, for example, search engines, validators,
> authoring tools, data mining tools, and so forth, would benefit from
> _not_ handling errors in HTML documents in the same way as browsers do?

They all handle errors in different ways.  It would be utterly
stupid for an authoring tool to generate error-filled HTML just
because the browser spec says that it must auto-adjust the DOM
(which it doesn't even have) rather than simply fix the HTML or
print an error. Error handling is entirely dependent on context.

> 8. Firefox is getting buggier with every release.
>
> Could you provide examples showing that Firefox is regressing?

I didn't say that, but my personal experience of watching the process
with top and one crash per day with Firefox 3.0.3 has not been happy.
The 3.0.4 experience has been much better so far in terms of
reliability, but a resident memory size of 141M is ridiculous. YMMV.

> 9. Firefox market share has flatlined.
>
> Could you provide evidence showing that the market share of Firefox
> has stopped increasing? The aggregate data at Wikipedia's "Usage share
> of web browsers" page shows continuing growth. How is this data wrong?

It isn't wrong.  Usage share of Wikipedia doesn't match the authoring
share of my customers (for whom Firefox was more prevalent two years
ago than it is today).  I don't wish that to be the case.

> 10. HTML5 will stop innovation.
>
> Why would HTML5 stop innovation? Why did HTML4+DOM2HTML not stop
> innovation? What is the difference?

I didn't say it would stop innovation. HTML5 defines HTML as a
behavioral process within a DOM-implemented browser.  It trashes
the successful work on HTML standards in order to satisfy the whims
of a few browser implementors that have never implemented the
standards as they existed.  The fact that those few vendors who
had been consistently ignoring standards for the Web would take it
upon themselves to redefine what HTML means rather than implement
it correctly is not a sign of good.  It is just more of the same.

The difference is that I cannot implement HTML5 as it has been
specified, nor would I care to do so, and will object to its
publication for as long as that is the case.

> 11. Many of HTML5's features have no demand.
>
> Why do you think that features like Web Sockets, Workers, SQL storage,
> etc, have no basis for implementation? I am especially curious about
> this claim since for all of these features I have heard huge amounts
> of demand from authors, as well as reports of demand from Web browser
> vendors.

Because you don't listen to objections.

> 12. It is more important that authoring tools implement HTML correctly
>     than browser vendors implement HTML correctly.
>
> Could you explain this position? It would seem to me that it is
> equally important that both implement it correctly. The point of
> specifications is interoperability, one doesn't get that if only half
> of the tools implement the specification.

Data lasts longer than processes.

> 13. None of the features in HTML5 are going to be implemented by
>     authoring tools.
>
> Could you explain why you think this? It seems that features like
> <section>, <article>, <details>, etc, the new form controls, the menu
> widgets, contentEditable, drag and drop, etc, would all be things that
> authoring tools would want to expose to their users, so that they can
> create more semantic pages and more interactive pages.

Most of those are mark-up features (or at least could be defined as
such).
I said "ping, SQL-storage, websockets, workers, ... the list goes on."
Those features have no basis for being in HTML.

> I look forward to your explaining these assumptions (or correcting me
> if you do not in fact hold these assumptions to be true).
>
>
> On Mon, 17 Nov 2008, Roy T. Fielding wrote:
>>
>> HTML is a mark-up language
>
> HTML5 is not, and has never been intended to be, limited to a markup
> language. It is an application and document publishing platform.
> Indeed the specification used to be called "Web Applications 1.0"; it
> was renamed to "HTML5" after a W3C working group vote.

So, change the title back.  It wouldn't be the first time that a W3C
working group held a vote without having a clue about the consequences.

>> yet HTML5's scope appears to be [...]
>
> The document describes the scope. You may disagree with it, but it
> certainly hasn't been a secret -- heck, it's even in our charter, and
> has been supported by vote multiple times.
>
>
>> You aren't defining a mark-up language, so stop calling this effort
>> HTML.
>
> I can't change the name, it was agreed to by the working group through
> a vote (that I abstained from!). If you would like the name changed
> back to its original name, Web Applications 1.0, I would be happy to
> rename the spec, but you'll have to convince the working group chairs.

I am trying to convince the WG.

>> It just ends up confusing the folks who work on the Web architecture
>> (the protocols for communicating between independent implementations).
>
> Can you provide links showing this confusion?

This thread is enough of a link to document that.

>> The architecture is what must work across all implementations, not
>> just browsers.
>
> I agree, that's why this isn't a browser specification.

Then it shouldn't be specified in a way that only a browser
can comply with.

....Roy
Received on Tuesday, 18 November 2008 03:12:09 GMT
