- From: Aryeh Gregor <Simetrical+w3c@gmail.com>
- Date: Wed, 4 May 2011 19:06:30 -0400
- To: www-archive <www-archive@w3.org>
(Note: this is my personal opinion. It does not represent Google's position.)

First, the full HTML5 specification is available from the WHATWG under a permissive license:

http://svn.whatwg.org/webapps/

So the issue is what message the W3C should be sending to the larger world, and what precedent we should set for other W3C specs, not whether people can actually fork (they can anyway).

I'm strongly opposed to any license that does not explicitly permit forking, including all three of the W3C license options. I would support any widely-used permissive license (such as CC0, MIT, or three-clause BSD).

The primary argument against a license that allows forking is that the existence of multiple competing standards would be damaging, since it would fragment the market, confuse people, etc. No one disputes the fact that forks are very bad, compared to working out problems without forking. It is also clear that if HTML5 were restrictively licensed, rewriting it from scratch would be extremely hard. HTML5 is very large and detailed, and rewriting it would require years of full-time work by highly skilled editors.

Now, the only reason anyone would want to base a new standard on HTML5 is because they want to write a web-compatible standard. Most of the standard, for example the entire parsing section, makes no sense for any purpose but processing legacy content. It is chaotic and complicated, and most of it is fine details that could be rewritten much more simply if not for legacy constraints. Anyone writing a standard that doesn't need compatibility with existing content could use only a small fraction of the HTML5 text, which it could easily rewrite, so licensing restrictions would not affect it.

Furthermore, the only reason a prospective forker wouldn't be able to easily write their spec as a list of changes to HTML5 is because they want to make extensive and detailed changes.
If they're only adding new features, or removing existing features, or making minor adjustments here and there, it would be easiest to just state the differences. This would make their standard much shorter and easier to maintain -- and also presumably not subject to licensing restrictions. A list of changes would only be harder to use and maintain if the changes affected many particular details, so that the reader would have to jump back and forth to understand the meaning and the maintainers would have to update regularly to merge with changes in HTML5.

But the only ones with the motive to write a web-compatible spec that differs from HTML5 in extensive and detailed ways are implementers of browser engines. Compatibility with existing content is extremely important to them, because users can easily switch to another browser if pages don't work right. They need to be able to refine the spec continually to make its algorithms match web content better as bug reports come in. Nothing other than a browser engine needs this level of compatibility, because nothing other than a browser engine is expected to process HTML exactly like browser engines do.

The only time implementers of browser engines would want to fork HTML5 would be if either the implementers of all major engines felt that continuing HTML5 development at the W3C was seriously problematic, or if the implementers of at least two major engines felt that it was so disastrously harmful that they were willing to abandon compatibility with the remaining engines. Minor browser engines don't matter here, because they're forced to follow whatever standard the large browser engines do, or risk losing market share. A single browser engine wouldn't fork either, because a standard that expects to have only one implementation is useless.
If the development of HTML at the W3C ever degenerates to the point that multiple major implementers think it's better to fork than to continue at the W3C, the W3C has ipso facto failed at its job as a standards body, and the fork will be a *good* thing. Given how disruptive a fork is, such a situation can only happen when the W3C persistently and flagrantly ignores the real-world needs of implementers, which in turn are forced by market pressure to reflect the needs of users. If the W3C were able (such as by licensing restrictions) to prevent a fork even when it has failed that terribly, it would only be destructive to the web.

Thus while most forks are bad, the very special class of forks that would actually be hindered by licensing restrictions are most likely *good*, if not indispensable. Those serve as a safety hatch in case the W3C fails badly at its job.

Unfortunately, such forks are not only theoretical, because the W3C did fail badly at its job in recent memory. Before HTML5 was started in 2004, the last HTML standard that defined features authors could use in practice was released in 1998. The work after that date was all on XHTML, which was never used significantly by authors. When Mozilla and Opera asked the W3C if they could work on adding non-XML features to HTML that would be usable by authors in the short term, the W3C refused, so they created the WHATWG (along with Apple) to work on a standard that would be useful to them. Eventually the W3C acknowledged that the WHATWG work was the right path to pursue, forming this HTMLWG in 2007 to work on the WHATWG's HTML standard and disbanding the competing XHTMLWG in 2009.

The W3C's inattention to author needs from 1998 to 2007 was extremely harmful, and directly contributed to the flourishing of proprietary technologies.
Increases in computing power and bandwidth made major new web applications practical, such as video and in-browser 2D graphics, but not even basic support was added to any standard that browsers felt they could implement. Thus Macromedia Flash (now Adobe Flash) gained nearly 100% market share, and it has become so integral to web content in practice that typical users would rightfully regard a browser that didn't support Flash as broken. It's difficult to imagine a greater failure of the standards process. Only because the WHATWG fork defined standard video and canvas tags is it becoming practical to even begin loosening Flash's stranglehold on the web, and that will take years yet.

Nor is it merely conjecture that other types of forks do not need to reuse the specification text, as evaluation of actual historical forks shows. A partial list of HTML forks not sanctioned by the W3C is given at <http://wiki.whatwg.org/wiki/Forking#Existing_forks_of_HTML>: ISO HTML, WML, XHTML-MP, WTVML, WHATWG HTML, CE-HTML, EPUB, and HTML 4.1. Of these, I couldn't figure out how XHTML-MP, WTVML, or CE-HTML work, because the standards don't seem to be available for free online. ISO HTML and EPUB are both defined purely by reference to preexisting HTML standards, only listing particular changes, so they could not be prevented by licensing. WML is only loosely based on HTML, and wasn't intended for compatibility with existing web content or browsers, so it never had any need to reuse specification text. HTML 4.1 is a rewrite from scratch, but is organized by an ad hoc group on a wiki, seems to be entirely inactive, and doesn't show any indications of interest by any implementers at all, so it can safely be ignored. WHATWG HTML is the only one that actually would have benefited from a large body of existing high-quality spec text to build on, had such a body existed when it was forked in 2004 -- and that fork was unequivocally good.
By contrast, those who argue that forking is bad have failed to provide any concrete examples of bad forks that would have benefited from a permissive specification license. Because the WHATWG has made HTML5 available under a permissive license since 2004, such examples should be easy to come by if they were likely to come up. But to my knowledge, none exist. For instance, EPUB3 is based on HTML5, so it could have forked the text, but it actually just defines a list of changes. Thus we have clear real-world evidence set against unsubstantiated conjecture.

Another objection to allowing forking is that companies will not want to pay specification editors if they have no control over the results. This might be true in some cases, but HTML5 editing is paid for by Google, which already releases it under a permissive license at the WHATWG. More generally, this is an argument against requiring all W3C specs to be permissively licensed, but it is not an argument against licensing specific specifications permissively if the editors' employers want it. Also, companies do not currently retain rights to the specifications they sponsor, as a matter of course: they're required to assign copyright to the W3C. Permissive licensing gives them *more* rights to the work, since they can continue it outside the W3C if they see fit.

Wayne Carr also raises the possibility that organizations might create device-specific variants of specs instead of working within the W3C. This has actually happened, as in the case of WML. However, as I explain above, such specifications will not have any need for the HTML5 text itself, and will not be affected by its license. WML, for example, is only loosely based on HTML. In other cases, vendors add features to support their devices, but again, there is no need to reproduce any HTML5 spec text to add new features.
For instance, some of Apple's proprietary iOS extensions are documented here, and no text from HTML5 is present: http://developer.apple.com/library/safari/#documentation/appleapplications/reference/SafariWebContent/ConfiguringWebApplications/ConfiguringWebApplications.html
Received on Wednesday, 4 May 2011 23:07:17 UTC