- From: Nikita The Spider The Spider <nikitathespider@gmail.com>
- Date: Thu, 6 Mar 2008 21:12:50 -0500
- To: "Frank Ellermann" <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
- Cc: www-validator@w3.org
On Thu, Mar 6, 2008 at 2:18 PM, Frank Ellermann <nobody@xyzzy.claranet.de> wrote:
>
> Nikita The Spider The Spider wrote:
> >
> > And the following doctypes:
> > 2840: <!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> > 830: <!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
> [...]
> >
> > 34: <!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.1//EN"
> > "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
> [...]
> >
> > And the following media types:
> > 5121: text/html
>
> Not a single application/xhtml+xml, XHTML 1.0 is alive and kicking.

I get a few of them, though not many. For instance, another sample of
300 sites/265872 pages gives this distribution of media types:

264246: text/html
1636: application/xhtml+xml

I think Nikita may see fewer application/xhtml+xml pages than are in
the wild. One reason is that her user agent string is simply
"Nikita the Spider (http://NikitaTheSpider.com/)" -- no
"mozilla-compatible" or other strings aimed at influencing code that
sniffs user agents. The other reason is that she sends an Accept
header of "*/*", which of course doesn't exclude
application/xhtml+xml, but neither does it explicitly mention it. I
would guess that many servers that conditionally send
application/xhtml+xml explicitly check for the presence of that string
in the Accept header and, if it isn't present, fall back to text/html.

--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more
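
The guess above about server behaviour can be illustrated with a minimal sketch. It assumes a server that does a plain substring check on the Accept header and ignores q-values and wildcards. The function name choose_media_type is hypothetical, and real servers may well negotiate differently.

```python
def choose_media_type(accept_header: str) -> str:
    """Pick a Content-Type the way a naive negotiating server might.

    Assumption: the server only looks for the literal string
    "application/xhtml+xml" and does not parse q-values or wildcards.
    """
    if "application/xhtml+xml" in accept_header.lower():
        return "application/xhtml+xml"
    # Anything else, including a bare "*/*", falls back to text/html.
    return "text/html"


# A crawler sending only "*/*" never names the XHTML media type,
# so under this assumption it is served text/html.
print(choose_media_type("*/*"))

# A header that names the type explicitly gets it back.
print(choose_media_type("text/html,application/xhtml+xml,*/*;q=0.8"))
```

Under that assumption, a crawler whose Accept header is just "*/*" would never be served application/xhtml+xml, which would be consistent with the low counts reported above.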
Received on Friday, 7 March 2008 02:13:02 UTC