Re: IO Error whenever page is named contact.html from Michael[tm] Smith on 2020-05-06 (www-validator@w3.org from May 2020)

From: Michael[tm] Smith <mike@w3.org>
Date: Thu, 7 May 2020 08:54:35 +0900
To: Leonid Batkhan <leonid.batkhan@lenetek.com>
Cc: www-validator@w3.org
Message-ID: <20200506235435.GA2181@sideshowbarker.net>

Leonid Batkhan <leonid.batkhan@lenetek.com>, 2020-05-06 15:14 -0400:
> Archived-At: <https://www.w3.org/mid/006e01d623da$8b7ee440$a27cacc0$@lenetek.com>
> ...
> 
> However, given that all the above pages are accessible and retrievable
> via browsers and function just fine, why the validator can just ignore
> that 409 server response and still run validation report on the page. The
> validator should validate accessible and retrievable page based on its
> contents only and disregard that server code.

The HTML checker does in fact have an option that lets you force checking
even when the response is an HTTP error. It’s in the checker UI at
https://validator.w3.org/nu/: press the "Options..." button and then select
the "check error pages" option.

But in the case of the 409 error for https://www.usa-travel.us/contact.html,
that option is not very useful, because this is what the site sends back
when the checker requests that URL:

  $ curl -i https://www.usa-travel.us/contact.html
  HTTP/2 409
  date: Wed, 06 May 2020 23:34:50 GMT
  server: Apache
  content-length: 83
  content-type: text/html; charset=iso-8859-1

  <script>document.cookie = "humans_21909=1"; document.location.reload(true)</script>

That is, the server isn’t sending the normal/expected contents of the page at
https://www.usa-travel.us/contact.html. Instead the only contents it sends are:

  <script>document.cookie = "humans_21909=1"; document.location.reload(true)</script>
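
For anyone who wants to detect this kind of response before sending a URL to
the checker, here is a minimal sketch in Python. It is only a heuristic, and
it assumes the stub always looks like the one above: a lone script element
that sets a cookie and reloads. The names STUB, _StubDetector, and
is_cookie_reload_stub are hypothetical and not part of the checker.

```python
from html.parser import HTMLParser

# The exact body the site sent back in the curl output above.
STUB = ('<script>document.cookie = "humans_21909=1"; '
        'document.location.reload(true)</script>')


class _StubDetector(HTMLParser):
    """Records start tags, script source, and any text outside <script>."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.script_text = []
        self.visible_text = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.script_text.append(data)
        elif data.strip():
            self.visible_text.append(data)


def is_cookie_reload_stub(body: str) -> bool:
    """Heuristic (an assumption, not a standard): True when the body is
    nothing but one script that sets a cookie and reloads the page."""
    parser = _StubDetector()
    parser.feed(body)
    parser.close()
    script = "".join(parser.script_text)
    return (
        parser.tags == ["script"]
        and not parser.visible_text
        and "document.cookie" in script
        and "reload" in script
    )
```

Calling is_cookie_reload_stub(STUB) flags the response above, while an
ordinary HTML page with markup and text is not flagged.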

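A browser runs that script, sets the cookie, and reloads the page, which is
presumably why the page works fine in browsers. The checker doesn’t execute
scripts, so that script element is the only content it ever gets to check.

The following link shows what the checker reports when checking that response: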

https://validator.w3.org/nu/?showsource=yes&checkerrorpages=yes&doc=https://www.usa-travel.us/contact.html
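
A link like that one can also be assembled programmatically. The following is
a rough sketch in Python that uses only the query parameters visible in the
link above (showsource, checkerrorpages, doc); the function name checker_url
and its keyword arguments are made up for illustration.

```python
from urllib.parse import urlencode

CHECKER = "https://validator.w3.org/nu/"


def checker_url(doc: str, check_error_pages: bool = True,
                show_source: bool = True) -> str:
    """Build a checker link for `doc`, using the parameters seen above."""
    params = {}
    if show_source:
        params["showsource"] = "yes"
    if check_error_pages:
        # Same effect as selecting "check error pages" in the checker UI.
        params["checkerrorpages"] = "yes"
    params["doc"] = doc
    # safe=":/" leaves the document URL readable rather than percent-encoded.
    return CHECKER + "?" + urlencode(params, safe=":/")


print(checker_url("https://www.usa-travel.us/contact.html"))
```

With the default arguments this reproduces the link above for the same
document URL.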

-- 
Michael[tm] Smith https://people.w3.org/mike

Received on Wednesday, 6 May 2020 23:54:50 UTC