Re: "vastly increases reverse-engineering costs" and version proposal

From: L. David Baron <dbaron@dbaron.org>
Date: Mon, 4 Jan 2010 10:57:12 -0500
To: public-html@w3.org
Message-ID: <20100104155712.GA18463@pickering.dbaron.org>
On Saturday 2010-01-02 13:45 -0800, Larry Masinter wrote:
> L. David Baron wrote, on December 3, 2009 11:49 PM
> Re: DOCTYPE versioning change proposal (ISSUE-4)
> http://lists.w3.org/Archives/Public/public-html/2009Dec/0126.html
> "as it vastly increases reverse-engineering costs for other implementations
> that seek to be compatible with that implementation", and with
> reference to 
> http://lists.w3.org/Archives/Public/public-html/2007Apr/0279.html  and
> http://lists.w3.org/Archives/Public/www-tag/2009Aug/0054.html  .
> Since "vastly increased reverse-engineering costs"
> have been cited elsewhere as motivation for some of
> the other decisions around HTML, I thought it was
> worth pushing on this a bit.
> * What are the motivations for reverse engineering of HTML engines? 
> It seems, though, that reverse engineering only applies
> when either
>   a) popular software does not follow already documented 
>     standards and practices. 
>   b) the documented standards and practices are insufficiently
>    precise to determine interoperability.
> On the other hand, if reverse engineering is caused by
> (b), then making the specification more precise and
> clarifying ambiguities would be helpful.

Making a single specification more precise only helps in so far as
that specification is relevant to the handling of documents.  Some
versioning proposals have argued that documents should be handled
based on the specification matching their DOCTYPE declaration; this
would mean that the clarifications of ambiguities would need to be
made in the HTML5, HTML4, HTML3.2, HTML2, and other HTML
specifications (including future ones) in order to be useful for
this purpose.  (In practice, changing the most precise one would
likely be enough to make implementations go with that behavior,
except for the cases where implementations actually use an entirely
different browser engine depending on the DOCTYPE declaration.)
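
As a rough illustration of the versioning model described above (not taken
from any proposal or browser), a DOCTYPE-keyed dispatch might look like the
sketch below. The public identifiers for HTML 2.0, 3.2 and 4.01 are the real
ones; the function name `select_spec`, the lookup table, and the choice to
treat a bare `<!DOCTYPE html>` as HTML5 are assumptions made for the example.

```python
import re

# Public identifiers as published in the respective specifications.
# The mapping to short labels is hypothetical, for illustration only.
SPEC_BY_PUBLIC_ID = {
    "-//IETF//DTD HTML 2.0//EN": "HTML2",
    "-//W3C//DTD HTML 3.2 Final//EN": "HTML3.2",
    "-//W3C//DTD HTML 4.01//EN": "HTML4",
}

# Matches a leading DOCTYPE and captures the public identifier, if any.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+html\b(?:\s+PUBLIC\s+"([^"]*)")?', re.IGNORECASE
)


def select_spec(document: str) -> str:
    """Pick the specification a document would be handled under."""
    match = DOCTYPE_RE.match(document.lstrip())
    if match is None:
        return "no-doctype"
    public_id = match.group(1)
    if public_id is None:
        # Bare <!DOCTYPE html>: assumed here to mean HTML5.
        return "HTML5"
    return SPEC_BY_PUBLIC_ID.get(public_id, "unknown")
```

Under this model every label in the table names a separate specification
that would need its ambiguities clarified, which is the cost the paragraph
above points at.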

> But in neither case is the introduction of a version
> indicator itself seem to have any effect.

It would have a significant effect in the following scenario:

 * HTML5 specification comes out, with a particular DOCTYPE
 * Microsoft writes an entirely new engine for Internet Explorer,
   but enables it for Web pages only with a particular DOCTYPE
   declaration (something that I think could have happened for IE8
   had HTML5 been done sooner)
 * Above 2 steps repeat a few additional times (with incrementing
   version numbers for both spec and implementation)
 * Internet Explorer gains >90% market share in at least some
   markets (this is currently true in a few countries, I think)

In this situation, any browser that wants to render the Web needs to
reverse-engineer:
 * the pre-HTML5 version of Internet Explorer (which already has
   both quirks and standards mode)
 * the HTML5 version of Internet Explorer
 * the HTML6 version of Internet Explorer
 * etc.
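
To make the growth in that list concrete, here is a small sketch that
enumerates the distinct behaviors a competing browser would have to match
in this scenario. The function name `behaviors_to_match` and the labels are
hypothetical; the only facts taken from the scenario are that the pre-HTML5
engine already has two modes and that each later specification version adds
one more engine.

```python
from typing import List


def behaviors_to_match(latest_html_version: int) -> List[str]:
    """List the engine behaviors a competitor must be compatible with."""
    # The pre-HTML5 engine already has both quirks and standards mode.
    behaviors = ["pre-HTML5 quirks mode", "pre-HTML5 standards mode"]
    # One additional engine per DOCTYPE-gated specification version.
    behaviors += [
        f"HTML{version} engine"
        for version in range(5, latest_html_version + 1)
    ]
    return behaviors
```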

> In general, though, the cost of reverse engineering
> existing HTML software's behavior in responding to 
> version indicators has already been paid; whether it
> was or wasn't "vast" in the past, is there any reason
> to believe that allowing an optional version indicator
> in HTML would add any additional "reverse engineering"
> costs at all?

You seem to be assuming that there will never again be a situation
in which one engine dominates the market, and then others want to
compete with it.  That situation has happened twice before (Netscape
in the mid-90's [1], and Internet Explorer in the
early-to-mid-2000's [2] and continuing to the present in some
countries).  I think the probability it could happen again is
substantial (even ignoring the countries in which it is still true
[3]).

> In any case, I don't see how this argument applies to
> THIS actually change proposal. It's an interesting
> data point, a sore spot, a rationale for providing
> more precision than has been the norm in previous
> HTML specifications.

The argument in reference to this change proposal (quoting the first
URL cited in your message) was, in fact:
  # This part of the proposal makes me uncomfortable, because it assigns
  # one particular rationale ("as authors tend to use ...") to a SHOULD
  # NOT statement that I think there are much stronger arguments for.

> Yes, if version specific behavior were allowed or
> encouraged, that *might* increase the relevance of the
> "reverse engineering costs" argument, although
> even then the behavior of market-leading implementations
> against specifications is really out of the control
> of specification writers.

I agree that behavior against specifications is out of our control.
However, the question at hand is what the specification should say
about whether to follow itself or other specifications depending on
the contents of the DOCTYPE declaration.


[1] http://en.wikipedia.org/wiki/File:Netscape-navigator-usage-data.svg
[2] http://en.wikipedia.org/wiki/File:Internet-explorer-usage-data.svg
[3] http://blog.mozilla.com/gen/2008/09/29/987-internet-explorer-in-south-korea/

L. David Baron                                 http://dbaron.org/
Mozilla Corporation                       http://www.mozilla.com/
Received on Monday, 4 January 2010 15:57:41 UTC