- From: <bugzilla@jessica.w3.org>
- Date: Sun, 15 Nov 2015 02:52:10 +0000
- To: www-validator-cvs@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=29291
Bug ID: 29291
Summary: robots.txt on 1 site supposedly blocking some URLs in
other sites
Product: Validator
Version: HEAD
Hardware: PC
URL: http://cold32.com
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: check
Assignee: dave.null@w3.org
Reporter: Nick_Levinson@yahoo.com
QA Contact: www-validator-cvs@w3.org
CC: dave.null@w3.org, www-validator-cvs@w3.org
Target Milestone: ---
When I entered <http://cold32.com> into the W3C Link Checker, allowed 10 levels of
recursion (more than needed), and set it to send the Referer header, it skipped a
small percentage of links because they were supposedly blocked by robots.txt. But I
saw no such block in http://cold32.com/robots.txt, and I don't know why any other
site's robots.txt file would control this. The skipped links include:
from <http://cold32.com/4/clothing-and-hair/2/where-to-buy-coats.htm>:
http://www.gutenberg.org/cache/epub/7213/pg7213.txt
from <http://cold32.com/5/action/6/showers-but-not-heaters.htm>:
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2094925/
from probably most *.htm and *.html pages:
http://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js
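For reference, here is a minimal sketch (Python, standard library only) of how one
might test whether the linked hosts' own robots.txt files disallow the checker. It
assumes the Link Checker presents a robots token of "W3C-checklink", which is my
understanding rather than something confirmed in this report, and it only checks the
Robots Exclusion Protocol rules published by each linked host:

from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# URLs the checker reportedly skipped because of robots.txt
FLAGGED = [
    "http://www.gutenberg.org/cache/epub/7213/pg7213.txt",
    "http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2094925/",
    "http://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js",
]

USER_AGENT = "W3C-checklink"  # assumption: the checker's robots token

for url in FLAGGED:
    parts = urlparse(url)
    robots_url = f"{parts.scheme}://{parts.netloc}/robots.txt"
    rp = RobotFileParser(robots_url)
    rp.read()  # fetch and parse that host's robots.txt
    allowed = rp.can_fetch(USER_AGENT, url)
    print(f"{robots_url}: {'allows' if allowed else 'blocks'} {url}")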
For this bug report, I guessed the Component and Version fields; the Link Checker
itself reports version 4.81.
--
You are receiving this mail because:
You are the QA Contact for the bug.