Re: IO Error whenever page is named contact.html

From: Michael[tm] Smith <mike@w3.org>
Date: Tue, 5 May 2020 20:25:55 +0900
To: Leonid Batkhan <leonid.batkhan@lenetek.com>
Cc: www-validator@w3.org
Message-ID: <20200505112555.GA114367@sideshowbarker.net>
OK, I looked into this and the short answer is that it’s a hosting issue
with the https://www.usa-travel.us/ site and with a number of other sites. And
there is nothing we can do from the W3C side to fix it.

If you think this problem is affecting a site you run, what you can do is:
Tell your hosting provider to whitelist the 128.30.52.0/24 subnet. That is
the IP range for the W3C validator service.
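
For anyone who wants to confirm that a blocked request in their server logs
really came from the validator, here is a minimal Python sketch. The helper
name is hypothetical; the only fact taken from this message is the
128.30.52.0/24 range, and the check uses the standard-library ipaddress module.

```python
import ipaddress

# The subnet named in this message as the W3C validator service's IP range.
W3C_VALIDATOR_NET = ipaddress.ip_network("128.30.52.0/24")


def is_validator_address(addr: str) -> bool:
    """Hypothetical helper: True if addr lies inside the validator subnet."""
    return ipaddress.ip_address(addr) in W3C_VALIDATOR_NET
```

Calling is_validator_address() on the client address of a request that got the
409 response tells you whether an allowlist entry for that subnet would cover it.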

The longer answer is that there appear to be a number of hosting providers
or sites that are running some kind of blocking mechanism which checks the
IP address of each request, and if (1) they find that the IP address is in
some blocklist they use, and (2) the request URL has “contact” or “register”
in the path, then the mechanism causes the server to respond with a 409
error.
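
To make the two conditions concrete, here is a rough Python model of the
behavior as observed from the outside. It is only a sketch, not the actual
code of whatever system is doing the blocking: the function name, the
blocklist argument, the case-sensitive substring match, and the 200 status
for unblocked requests are all assumptions made for illustration.

```python
import ipaddress

# Path fragments that appear to trigger the block, per the observations here.
TRIGGER_WORDS = ("contact", "register")


def model_response_status(client_ip: str, path: str, blocklist: list[str]) -> int:
    """Hypothetical model: 409 only when BOTH conditions hold, else 200."""
    addr = ipaddress.ip_address(client_ip)
    # Condition (1): the client address is in some blocklisted network.
    listed = any(addr in ipaddress.ip_network(net) for net in blocklist)
    # Condition (2): the request path contains one of the trigger words.
    # Whether the real check is case-sensitive is unknown; this one is.
    triggered = any(word in path for word in TRIGGER_WORDS)
    return 409 if listed and triggered else 200
```

With a blocklist containing 128.30.52.0/24, a validator request for
/contact.html comes back as 409 in this model while the same client fetching
/index.html gets 200, which matches the pattern reported in this thread.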

The sites with this issue all seem to be sites that are running WordPress
or in some cases maybe not running WordPress but just running a PHP backend.

And it’s possible that the mechanism behind this issue is the software
system called “Wordfence”.

Regardless, whatever the system is that’s doing this, it appears to rely on
checking some kind of distributed blocklist of IP addresses, and the W3C
validator IP address range ended up in that blocklist.

So, as I mentioned above, if you think your site is affected by this, then
ask your hosting provider to un-block the 128.30.52.0/24 subnet, or ask
them to get the 128.30.52.0/24 subnet removed from whatever distributed
blocklists they’re using. Or else ask them to stop using altogether
whatever blocklists list the 128.30.52.0/24 addresses.

Whatever blocklists exist that have 128.30.52.0/24 IP addresses in them
are bad, broken, poorly-administered blocklists that nobody should be
relying on. There is nothing originating from those (W3C) addresses that
even remotely could be considered abuse, and nothing that would merit those
IP addresses ending up in a blocklist.

And if W3C server IP addresses are in a blocklist mistakenly, it is very
likely that quite a few other legitimate IP addresses are mistakenly in
that same blocklist. And the effect of that would be that you have users/
customers who aren’t able to access any pages at your site which have
“contact” or “register” in the page filenames/paths.

Leonid Batkhan <leonid.batkhan@lenetek.com>, 2020-05-04 18:27 -0400:
> Archived-At: <https://www.w3.org/mid/002301d62263$46945040$d3bcf0c0$@lenetek.com>
> 
> I tried validating several websites, and noticed that whenever page is
> called contact.html I am getting the following
> 
> 1.      IO Error: HTTP resource not retrievable. The HTTP status from the
> remote server was: 409.
> 
> https://www.usa-travel.us/contact.html
> 
> Here is a screenshot:
> 
> 
> 
> I checked that on several websites with consistent results which only
> affects pages named contact.html.  Even when I renamed perfectly validated
> page to be named contact.html that page stops being validated.
> 
> Could you please let me know what is going on?
> 
> Thank you in advance.
> 
> Leonid Batkhan



-- 
Michael[tm] Smith https://people.w3.org/mike

Received on Tuesday, 5 May 2020 11:26:15 UTC
