- From: olivier Thereaux <ot@w3.org>
- Date: Mon, 25 Jun 2007 10:25:24 +0900
- To: Marc-Antoine Ross <marc@proze.net>
- Cc: www-validator@w3.org
Hello Marc-Antoine,

On Jun 13, 2007, at 00:25, Marc-Antoine Ross wrote:

> Recently, the W3C blacklisted our service because it was using too
> much ressources on their servers, according to their policy.

To be fair, you should also mention that I had contacted you a number of times in the span of a year, urging you to install a local validator. Your service, albeit indeed very nice, was sending a lot of requests to the validator, in fast and large bursts, which, as I explained at the time, was not correct behavior according to the usage policy for the API. When we switched on automatic protection mechanisms to protect the validation service from abuse, your site was logically blocked...

> 1/ Install the validator locally on our server and run it from
> there. A good advantage would be that each page will be read only
> once by the service compared to twice before (one by validator.ca
> and once by the W3C). Problem: I don't have anybody available to
> install the validator.

More than a year ago, you said you'd talk to your admin about installing the validator locally, and I offered to get in touch to help. The validator's installation is not trivial, but it's not rocket science either, and many have managed to install it on their server in the past. Anyway, I reiterated my offer to help a few times, and reiterate it here again.

> 3/ Offer the code and my contribution to the W3C and make this
> service widely available and supported.

As I told you a few times, I think your project is very cool, and if you want to release it as open source and contribute it to W3C, that is very welcome.

There are a number of reasons why I think we may not be able to integrate the code "as is" in the validator. A minor reason is that it relies a lot on JavaScript/XHR. This makes for a cool UI, but without a fallback mechanism, we can't offer it to our wide user base.

A more important reason is the cause of your blacklisting: your validator interface sends large and fast bursts of requests to the validator, without any consideration of current server load. If we were to open such an interface at validator.w3.org, it would likely kill our servers, or make them unusable for the service's millions of users. The indexing also sends lots of requests to the validated site, and indexes it without considering the preferences of the webmaster in the robots.txt. We can't really do that.

I think, given these constraints, the "perfect" batch-validation service would work like this:

* Users register a batch job, ideally with a mail loop checking that the person who requests the batch job actually owns the site that is being validated.
* The validator keeps a queue of batch jobs to be made, and processes them (server-side) whenever the server load allows it. Indexing of the site would be done with a pause between requests, and would respect the robots.txt protocol. Alternatively, the user can request validation of a list of URIs.
* The batch validation would use the validator's API to get results.
* Once the batch job is finished, the requesting user would receive a mail with a summary of the job, and a link to the full results (or the content in the mail?).

Admittedly this is getting a bit far from your batch validator, but we do have some building blocks here and there that could be used (e.g. the logvalidator). If you think your code could be used or adapted, or if you're interested in participating in the development of the validator in that direction, the door is always open.

Regards,
-- 
olivier
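
For illustration only, the server-side part of the batch job outlined above (a queue, a pause between requests, a check of the server load, and respect for robots.txt) could be sketched roughly as follows in Python. This is not existing validator code: the function names, the `batch-validator` user agent string, and the `validate` and `load_ok` callables are all hypothetical stand-ins, and the robots.txt content is assumed to have been fetched already.

```python
import time
from collections import deque
from urllib.robotparser import RobotFileParser


def allowed_uris(uris, robots_lines, user_agent="batch-validator"):
    """Keep only the URIs that robots.txt lets this (hypothetical) agent fetch."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # robots.txt content, already fetched, as a list of lines
    return [u for u in uris if parser.can_fetch(user_agent, u)]


def run_batch(uris, validate, load_ok, pause=1.0, sleep=time.sleep):
    """Validate queued URIs one at a time, never in a burst.

    validate: hypothetical callable that sends one URI to the validator
              and returns its result.
    load_ok:  hypothetical callable returning True when the server load
              allows more work.
    """
    queue = deque(uris)
    results = {}
    while queue:
        if not load_ok():
            sleep(pause)  # server is busy: wait and check again
            continue
        uri = queue.popleft()
        results[uri] = validate(uri)
        if queue:
            sleep(pause)  # pause between consecutive requests
    return results
```

A caller would first filter the list with `allowed_uris`, then hand the result to `run_batch`; the mail loop and the summary mail are left out of this sketch.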
Received on Monday, 25 June 2007 05:55:43 UTC