"vastly increases reverse-engineering costs" and version proposal from Larry Masinter on 2010-01-02 (public-html@w3.org from January 2010)

From: Larry Masinter <masinter@adobe.com>
Date: Sat, 2 Jan 2010 13:45:25 -0800
To: "public-html@w3.org" <public-html@w3.org>
Message-ID: <C68CB012D9182D408CED7B884F441D4D309097@nambxv01a.corp.adobe.com>
L. David Baron wrote, on December 3, 2009 11:49 PM
Re: DOCTYPE versioning change proposal (ISSUE-4)
http://lists.w3.org/Archives/Public/public-html/2009Dec/0126.html


"as it vastly increases reverse-engineering costs for other implementations
that seek to be compatible with that implementation", and with
reference to 
http://lists.w3.org/Archives/Public/public-html/2007Apr/0279.html  and
http://lists.w3.org/Archives/Public/www-tag/2009Aug/0054.html  .

Since "vastly increased reverse-engineering costs"
have been cited elsewhere as motivation for some of
the other decisions around HTML, I thought it was
worth pushing on this a bit.


* What are the motivations for reverse engineering of HTML engines? 

A search of the literature on reverse engineering
comes up with many reasons why reverse engineering might
be practiced, often around maintenance of legacy
software that was poorly documented. 
(See, for example, Wikipedia "reverse engineering"
and the survey cited there).

However, in the case of HTML, it seems that the phrase
"reverse-engineering costs" is mainly used as a reference
to the cost of determining what popular software (IE) did,
because other HTML engines wanted to be compatible
with the market leader.

It seems, though, that reverse engineering only applies
when either
  a) popular software does not follow already documented
     standards and practices, or
  b) the documented standards and practices are insufficiently
     precise to determine interoperability.

I would claim that if there are market forces that
promote (a) (as happened at least in Browser Wars 1.0),
then little or nothing that is actually written in the
standard can matter. If software is popular because
of marketing or tie-ins with the deployed operating systems
or mobile platforms, and there is a competitive advantage
to those platform vendors to support proprietary extensions
or platform-specific behavior (presumably to create
vendor lock-in by increasing reverse engineering costs
for their competitors), then nothing we write in any
specification can prevent that.

On the other hand, if reverse engineering is caused by
(b), then making the specification more precise and
clarifying ambiguities would be helpful.

But in neither case does the introduction of a version
indicator itself seem to have any effect.

Are there any other situations when reverse
engineering is useful or for which the cost
is a consideration for the HTML standard?

* What are the costs associated with reverse engineering?
When are those costs "vast"? 

Although there is some literature on the general costs
or processes of reverse engineering, this particular argument 
seems to be addressed at the past cost of determining
Internet Explorer behavior.

"Vastly increased" or "greatly increased" costs
matter, presumably, only if they are a significant part
of the overall cost of developing the software
that is attempting to be compatible.

In general, though, the cost of reverse engineering
existing HTML software's behavior in responding to 
version indicators has already been paid; whether it
was or wasn't "vast" in the past, is there any reason
to believe that allowing an optional version indicator
in HTML would add any additional "reverse engineering"
costs at all?

In any case, I don't see how this argument applies to
THIS actual change proposal. It's an interesting
data point, a sore spot, a rationale for providing
more precision than has been the norm in previous
HTML specifications.

Yes, if version specific behavior were allowed or
encouraged, that *might* increase the relevance of the
"reverse engineering costs" argument, although
even then the behavior of market-leading implementations
against specifications is really out of the control
of specification writers.

Larry
--
http://larry.masinter.net
Received on Saturday, 2 January 2010 21:46:05 UTC