- From: Nikita The Spider The Spider <nikitathespider@gmail.com>
- Date: Thu, 6 Mar 2008 21:12:50 -0500
- To: "Frank Ellermann" <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
- Cc: www-validator@w3.org
On Thu, Mar 6, 2008 at 2:18 PM, Frank Ellermann
<nobody@xyzzy.claranet.de> wrote:
>
> Nikita The Spider The Spider wrote:
>
>
> > And the following doctypes:
> > 2840: <!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> > 830: <!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
> [...]
>
> > 34: <!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.1//EN"
> > "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
> [...]
>
> > And the following media types:
> > 5121: text/html
>
> Not a single application/xhtml+xml, XHTML 1.0 is alive and kicking.
I get a few of them, though not many. For instance, another sample of
300 sites/265872 pages gives this distribution of media types:
264246: text/html
1636: application/xhtml+xml
I think Nikita may see fewer application/xhtml+xml pages than are in
the wild. One reason is that her user agent string is simply "Nikita
the Spider (http://NikitaTheSpider.com/)" -- no "mozilla-compatible"
or other strings aimed at influencing code that sniffs user agents.
The other reason is that she sends an Accept header of "*/*", which of
course doesn't exclude application/xhtml+xml, but neither does it
explicitly mention it. I would guess that many servers that
conditionally send application/xhtml+xml explicitly check for the
presence of that string in the Accept header and if it isn't present
fall back to text/html.
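
A minimal sketch of the kind of server-side check guessed at above. The function name choose_media_type and the plain substring test are illustrative assumptions, not code from any particular server; a real server might parse q-values or sniff the user agent as well.

```python
# Hypothetical server-side logic: send application/xhtml+xml only when
# the client's Accept header names it explicitly; otherwise fall back
# to text/html. The names here are illustrative, not from a real server.

XHTML = "application/xhtml+xml"
HTML = "text/html"


def choose_media_type(accept_header):
    """Return the media type a string-sniffing server might send."""
    # A bare "*/*" permits XHTML but never mentions it, so the
    # substring test fails and the server falls back to text/html.
    if accept_header and XHTML in accept_header.lower():
        return XHTML
    return HTML


# An Accept header of "*/*", as described for the spider above.
print(choose_media_type("*/*"))
# An Accept header that names the XHTML media type explicitly.
print(choose_media_type("text/html,application/xhtml+xml,*/*;q=0.8"))
```

Under this assumption, a client sending only "*/*" gets text/html, while a client that lists application/xhtml+xml gets the XHTML media type, which would account for the undercount.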
--
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more
Received on Friday, 7 March 2008 02:13:02 UTC