Content-Type not recognized

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Mon, 24 Nov 2014 16:40:09 +0200
Message-ID: <54734349.50506@cs.tut.fi>
To: Anton Andrushchenko <tony@westsidewholesale.com>, www-validator@w3.org
2014-11-24 13:34, Anton Andrushchenko wrote:

> we are getting very strange error message trying to validate our website:
>
> http://westsidewholesale.com
>
> http://validator.w3.org/check?uri=westsidewholesale.com
>
> The message says:
>
> Sorry, I am unable to validate this document because its content type is
> |text/html,|, which is not currently supported by this service.

This is strange indeed. But there is something strange with the headers, 
too. Looking at response headers using Firefox Web Developer Extension shows

Content-Type: text/html

But looking at them in Chrome dev tools shows:

Content-Type:text/html; charset=UTF-8
Content-Type:text/html

And taking a low-level look with
http://www.rexswain.com/httpview.html
we see (I guess) the actual response headers in the order sent:

HTTP/1.1·200·OK(CR)(LF)
Server:·nginx/1.0.14(CR)(LF)
Date:·Mon,·24·Nov·2014·14:18:29·GMT(CR)(LF)
Content-Type:·text/html(CR)(LF)
Connection:·close(CR)(LF)
Vary:·Accept-Encoding(CR)(LF)
X-Powered-By:·PHP/5.3.3(CR)(LF)
Content-Type·:·text/html;·charset=UTF-8(CR)(LF)
Pragma:·no-cache(CR)(LF)
Cache-Control:·no-cache,·must-revalidate,·no-store,·post-check=0,·pre-check=0,·max-age=31536000(CR)(LF)
Expires:·Tue,·24·Nov·2015·14:18:29·GMT(CR)(LF)
Vary:·Accept-Encoding,User-Agent(CR)(LF)
(CR)(LF)

Here “·” denotes space and “(CR)(LF)”, well, CR LF.
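A dump like this can also be checked mechanically. The following is a minimal Python sketch, not part of any tool mentioned here; the function name find_repeated_headers is made up for illustration. It assumes the dump has been pasted as plain text with real spaces and line breaks, and it strips whitespace around header names because the second Content-Type line above has a space before the colon.

```python
from collections import defaultdict


def find_repeated_headers(raw_headers: str) -> dict[str, list[str]]:
    """Group header values by normalized name; keep names sent more than once."""
    seen = defaultdict(list)
    for line in raw_headers.splitlines():
        if ":" not in line:
            continue  # skips the status line and the empty terminator line
        name, _, value = line.partition(":")
        # Header names are case-insensitive; also drop stray whitespace
        # such as the space in "Content-Type :".
        seen[name.strip().lower()].append(value.strip())
    return {name: values for name, values in seen.items() if len(values) > 1}


if __name__ == "__main__":
    sample = (
        "HTTP/1.1 200 OK\r\n"
        "Server: nginx/1.0.14\r\n"
        "Content-Type: text/html\r\n"
        "X-Powered-By: PHP/5.3.3\r\n"
        "Content-Type : text/html; charset=UTF-8\r\n"
        "\r\n"
    )
    for name, values in find_repeated_headers(sample).items():
        print(name, values)
```

Run on the shortened sample, it reports content-type with both values, in the order they were sent.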

So there is one Content-Type: text/html header without a charset 
parameter and, later, a second Content-Type header that does include 
one. Although each of them seems valid on its own, maybe the duplication 
is the cause of the confusion. Maybe the validator tries to combine the 
Content-Type headers into one and makes a mistake there.

However, I have been unable to reproduce the issue by sending such 
headers myself. The problem might depend on some specific combination 
of HTTP headers. (I cannot replicate the exact order of the headers, due 
to restrictions imposed by the server-side technology I use.)

> Looks like the content-type property value is treated as comma-trailed,
> but we don’t have this neither in the HTML code nor in response headers
> of the server.

My guess is that the validator somehow notices the presence of two 
Content-Type headers and then tries to construct a comma-separated list 
of their values, but fails. But this is just a very wild guess.
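To make the guess concrete, here is a minimal Python sketch of how such a failure could arise. It is purely hypothetical: it is not the validator's code, and the names combine_values and naive_media_type are invented for illustration. It assumes that repeated header values are joined with ", " and that the media type is then taken as the first whitespace-delimited token before any ";" parameters.

```python
def combine_values(values):
    """Join repeated header values into one comma-separated field value."""
    return ", ".join(values)


def naive_media_type(field_value):
    """Take the first whitespace-delimited token before any ';' parameters.

    Hypothetical parser: correct for a single Content-Type value, but it
    does not expect a comma-separated list.
    """
    before_params = field_value.split(";", 1)[0]
    return before_params.split()[0].lower()


combined = combine_values(["text/html", "text/html; charset=UTF-8"])
print(combined)                    # text/html, text/html; charset=UTF-8
print(naive_media_type(combined))  # text/html,
```

Under these assumptions the result is "text/html," with a trailing comma, which matches the content type quoted in the error message, while a single header value parses cleanly as "text/html".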

Anyway, if you can modify the HTTP response headers, try removing the 
first Content-Type header, the one without charset parameter.

Yucca
Received on Monday, 24 November 2014 14:40:43 UTC
