Re: Web SUBpages rejected with "Bad hostname" from Debbie Mitchell on 2011-08-16 (www-validator@w3.org from August 2011)

From: Debbie Mitchell <debbiem@companyv.com>
Date: Tue, 16 Aug 2011 12:50:43 -0700
To: erlkonig@talisman.org, www-validator@w3.org
Message-Id: <20110816194824.M89219@companyv.com>

When I validate: http://www.talisman.org/~erlkonig/img/

I get:

Validation Output: 1 Error  
   Line 2, Column 57: no system id specified
   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN">

Your document includes a DOCTYPE declaration with a public identifier (e.g. "-//W3C//DTD XHTML 1.0 Strict//EN") but no
system identifier (e.g. "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"). This is authorized in HTML (based on
SGML), but not in XML-based languages.

If you are using a standard XHTML document type, it is recommended to use exactly one of the DOCTYPE declarations from
the recommended list on the W3C QA Website.

    Line 10, Column 1: Missing xmlns attribute for element html. The value should be: http://www.w3.org/1999/xhtml
    <html>

Many Document Types based on XML need a mandatory xmlns attribute on the root element. For example, the root element for
XHTML might look like: 
 <html xmlns="http://www.w3.org/1999/xhtml">
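For illustration only, here is a rough Python sketch of the two checks reported above. It is not how the validator works; it uses simple regular expressions, and the names check_xhtml_prologue, BROKEN and FIXED are made up for this example. FIXED shows what the start of the page might look like with the system identifier and the xmlns attribute added, assuming XHTML 1.0 Strict is the intended document type.

```python
import re

# Rough patterns; a real validator parses the document instead.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]*)"(?:\s+"([^"]*)")?\s*>',
    re.IGNORECASE,
)
HTML_TAG_RE = re.compile(r'<html\b([^>]*)>', re.IGNORECASE)
XMLNS_RE = re.compile(r'\bxmlns\s*=\s*"([^"]*)"')

XHTML_NS = "http://www.w3.org/1999/xhtml"


def check_xhtml_prologue(source):
    """Return a list of problems found in the DOCTYPE and the root element."""
    problems = []

    doctype = DOCTYPE_RE.search(source)
    if doctype is None:
        problems.append("no PUBLIC DOCTYPE found")
    elif doctype.group(2) is None:
        # Public identifier present, system identifier (the DTD URL) missing.
        problems.append("no system id specified")

    root = HTML_TAG_RE.search(source)
    if root is None:
        problems.append("no html element found")
    else:
        xmlns = XMLNS_RE.search(root.group(1))
        if xmlns is None or xmlns.group(1) != XHTML_NS:
            problems.append("missing xmlns attribute for element html")

    return problems


# The declarations as reported by the validator (hypothetical minimal input).
BROKEN = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN">\n<html>\n'

# The same prologue with both reported problems addressed.
FIXED = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    '    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">\n'
)
```

Calling check_xhtml_prologue(BROKEN) reports both problems, and check_xhtml_prologue(FIXED) returns an empty list.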

---------- Original Message ----------- 
 From: "Jukka K. Korpela" <jkorpela@cs.tut.fi> 
 To: www-validator@w3.org, erlkonig@talisman.org 
 Sent: Tue, 16 Aug 2011 22:42:41 +0300 
 Subject: Re: Web SUBpages rejected with "Bad hostname"

> 16.8.2011 16:40, I (Jukka K. Korpela) wrote: 
> 
> > 16.8.2011 12:08, C. Alex. North-Keys wrote: 
> [...] 
> >>> 1. I got the following unexpected response when trying to retrieve 
> >>> <http://www.talisman.org/~erlkonig/img/>: 
> >>> 500 Can't connect to dont-waste-bandwidth-running-validator-here:80 
> > [...] 
> >> Of course, the validator was perfectly happy with other pages under the 
> >> same http://www.talisman.org/~erlkonig/ 
> > 
> > I suppose the issue is related to http://www.talisman.org/robots.txt 
> 
> Sorry, it seems that I was wrong about that - though I don't know 
> whether the validator actually requests robots.txt. The contents of 
> robots.txt may reflect the site administration's intentions, but there is a 
> more specific mechanism in action. 
> 
> It seems that the server www.talisman.org handles a request 
> from the W3C Validator in a special way. Testing with the HTTP request 
> and response analyzer 
> http://www.rexswain.com/httpview.html 
> using User-Agent: W3C_Validator 
> I get a response that consists of a 302 redirection to 
> http://dont-waste-bandwidth-running-validator-here/ 
> (That's of course a rather questionable way of excluding things. A 
> reasonable response would consist of some error code - not redirection - 
> and an accompanying error page.) 
> 
> So you need to either contact the www.talisman.org server admin or avoid 
> the issue by putting your HTML documents in a folder where they won't get 
> treated that way. I guess the "/img/" part in the URL is the key; the server 
> admin may think that such folders contain images only (the robots.txt 
> contents are a hint of this). 
> 
> -- 
> Yucca, http://www.cs.tut.fi/~jkorpela/ 
------- End of Original Message -------
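
The User-Agent test described in the quoted message can also be approximated without the web tool. Below is a minimal Python sketch using only the standard library. The function names probe and compare_agents are made up for this example, and it assumes a plain http:// URL. It sends one GET request per User-Agent string and does not follow redirects, so a 302 and its Location header stay visible.

```python
import http.client
from urllib.parse import urlsplit


def probe(url, user_agent):
    """Send one GET without following redirects.

    Returns a (status, location) tuple, where location is the value of the
    Location header or None if the response has none.
    Assumes a plain http:// URL (hypothetical helper, not a W3C tool).
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=10)
    try:
        conn.request("GET", target, headers={"User-Agent": user_agent})
        response = conn.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def compare_agents(url, agents):
    """Probe the same URL once per User-Agent string and collect the results."""
    return {agent: probe(url, agent) for agent in agents}
```

For example, compare_agents("http://www.talisman.org/~erlkonig/img/", ["W3C_Validator", "Mozilla/5.0"]) would show whether the server answers the two agents differently. What it returns depends on how the server is configured at the time of the request.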

Received on Tuesday, 16 August 2011 19:51:13 UTC