RE: (Entire) Site validation

Hi

You could save bandwidth (and allow higher limits) by only displaying
links to the pages that fail validation; clicking a link would then show
the details. Displaying all the errors at once is bandwidth-intensive,
and it is messy too. Another option is to generate an HTML report, zip it
up, and send the result.

When going large scale (up to 1000 pages), a ticket system or something
similar would have to be thought up so you could come back for the
results when they are finished, because a run could take a long time
(depending on the speed of the server the site is running on, the speed
of the validating server, etc.). A rough sketch of the idea is below.

Another problem would be the bandwidth used when fetching the pages from
servers in order to validate them. Some servers can gzip the content they
serve; I don't know whether the validator uses that technique to save
bandwidth, but it would be a good idea to use it (see the second sketch
below). I could help with development of the new validator, though I
don't know about the hosting.
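
To make the ticket idea concrete, here is a minimal Python sketch. It is
only an illustration of the flow, not the validator's actual code, and
all the names (submit, worker, check, validate_site) are made up;
validate_site stands in for the long-running crawl-and-validate step.

    import threading
    import uuid

    jobs = {}  # ticket id -> {"status": ..., "report": ...}

    def validate_site(url):
        # Placeholder for the real crawl-and-validate step, which could
        # take minutes for a large site.
        return "report for %s" % url

    def worker(ticket, url):
        jobs[ticket] = {"status": "done", "report": validate_site(url)}

    def submit(url):
        # Hand the user a ticket immediately and do the work in the
        # background.
        ticket = uuid.uuid4().hex
        jobs[ticket] = {"status": "running", "report": None}
        threading.Thread(target=worker, args=(ticket, url)).start()
        return ticket

    def check(ticket):
        # The user comes back later with the ticket to fetch the result.
        return jobs.get(ticket, {"status": "unknown ticket"})

A real service would persist the jobs somewhere instead of keeping them
in memory, but the submit/check split is the whole point of the ticket.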
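
And a sketch of the gzip idea on the fetching side (again Python, again
just an illustration, not how the validator actually fetches pages): the
client only has to advertise gzip support in the request and decompress
when the server honours it.

    import gzip
    import urllib.request

    def fetch(url):
        # Tell the server we accept gzip-compressed responses.
        req = urllib.request.Request(
            url, headers={"Accept-Encoding": "gzip"})
        with urllib.request.urlopen(req) as resp:
            body = resp.read()
            # Servers that honour the header say so in Content-Encoding.
            if resp.headers.get("Content-Encoding") == "gzip":
                body = gzip.decompress(body)
        return body

For text-heavy HTML, gzip commonly shrinks a response to a fraction of
its original size, so the saving across hundreds of pages would be
substantial.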

Martin Salo

-----Original Message-----
From: Liam Quinn [mailto:liam@htmlhelp.com] 
Sent: Tuesday, April 20, 2004 9:27 PM
To: David Dorward
Cc: Martin Salo; www-validator@w3.org
Subject: Re: (Entire) Site validation

On Tue, 20 Apr 2004, David Dorward wrote:

> On 20 Apr 2004, at 15:32, Martin Salo wrote:
> > What do you think about the idea of developing a validator that 
> > validates whole sites, not just single pages? It could search pages 
> > for links and, when a link is in the same domain (a subsection of 
> > the site), validate that page too. It should be limited (to about 
> > 200 pages or so) to avoid abuse.
> 
> Such a validator already exists, 
> <http://www.htmlhelp.com/tools/validator/>, although its limit is 50 
> pages (a fact which, last week, finally led me to get around to 
> setting up the W3C validator locally).

I just raised the limit to 100 pages.  (We recently moved to a new host
with cheaper bandwidth, so I was planning on raising the limit soon.)

> I have some concerns about the load this would cause on the W3C server; 
> it has a rather higher profile than the WDG. What is the load 
> (bandwidth / CPU) like at present (assuming there are no issues with 
> making that information public)?

The load depends a lot on the server that handles it.  On our old,
underpowered server, we had trouble with badly-behaved robots overloading
the server with too many CGI requests (sometimes involving the Validator,
but often other parts of the site as well).

I've changed the limit on the number of pages a few times in the past, but 
the reason for lowering the limit was always to compensate for the extra 
bandwidth needed to handle Microsoft mass-mailing worms at our mail 
server.

-- 
Liam Quinn
