Re: Proposed TAG Finding: Internet Media Type registration, consistency of use

After reading Tantek's plea to avoid treating HTML-like text/plain 
documents as plain text (rather than HTML), I have done some research to 
attempt to clarify the web status quo. The following data was obtained 
from the mozilla.org Bugzilla installation, in the "Tech Evangelism" 
component.

31 bugs were filed on Tech Evangelism for websites that served HTML pages 
as text/plain. Of these, 6 had changed significantly or were no longer 
available, leaving 25 mis-served pages. (By way of comparison, there 
exist 54 bug reports on Mozilla's refusal to accept non-text/css 
stylesheets, which occurs only in "strict mode", and only since the 0.9.7 
release on Dec. 21, 2001.) The types of web servers hosting these pages 
were determined from the bug report, if given there, or if not, from 
netcraft.com. Of the 25 pages, only 2 ran on server software released before the 
year 2000--one on Apache 1.3.3 and one on Netscape Enterprise 3.6 SP1. 5 
ran on software released during the year 2000 (Apache 1.3.11-1.3.14), 14 
on software released during the year 2001 (Apache 1.3.19-1.3.22 and 
thttpd/2.21 20 Apr 2001), and 4 on software released this year (Apache 
1.3.23-1.3.24 and SAMBAR 5.1). Interestingly, 16 of the 31 bugs were 
apparently due to extensions such as .shtml, .php3, and .asp not being 
mapped to the proper MIME type.
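
For anyone who wants to check the arithmetic, here is a rough sketch of 
the tally in Python. The counts are the ones given above; the dictionary 
labels and the share() helper are names made up for this illustration, 
not anything taken from Bugzilla.

```python
# Counts of mis-served pages by the release period of the server software,
# copied from the figures in this message. The labels are illustrative only.
servers_by_release_period = {
    "before 2000": 2,
    "2000": 5,
    "2001": 14,
    "2002": 4,
}

# The four groups should account for all 25 pages that were still mis-served.
total = sum(servers_by_release_period.values())


def share(count, total):
    """Return count as a whole-number percentage of total (hypothetical helper)."""
    return round(100 * count / total)


for period, count in servers_by_release_period.items():
    print(f"{period}: {count} of {total} ({share(count, total)}%)")
```

The four groups sum to the 25 pages still mis-served, and only 2 of them 
fall in the group released before 2000.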

Based on this data, I would tend to conclude that the supposed "web 
ghetto" of obsolete servers which Tantek invokes simply does not exist. 
Indeed, given that Netscape has never "supported" HTML-as-text/plain, it's 
unlikely that this misconfiguration could have become prevalent before 
~2000, when Netscape's market share began to fall precipitously. In 
summary, I don't feel that following the TAG's example will, in fact, 
cut off a significant portion of the web.

-- 
Chris Hoess

Received on Monday, 10 June 2002 18:44:18 UTC