Re: Page & validation statistics available

On Apr 7, 2008, at 09:47 , Nikita The Spider The Spider wrote:
> As a result of our conversation on validation statistics last month, I
> was inspired to collect some statistics based on the data my validator
> Nikita sees. If you're interested in the topic, you can read about it
> here:
> http://NikitaTheSpider.com/articles/ByTheNumbers/

Very cool, I read it with a lot of interest. Thank you very much for  
sharing it here.

Indeed, as you wrote in the report, there is a bias in such a study,
but I think that the particular (website) population being sampled is
very interesting to us, too. Having stats on "sites that try to
validate", in addition to the work being done on "the wild web" by e.g.
Brian, is very welcome.

One thing I was wondering: do you have stats on what ratio of the
pages you tested passed validation?

I recently ran some tests (on rather small sets, a few hundred URIs at a
time, hence not necessarily reliable) of:
- finding URIs that had been validated within the last 24 hours
- looking at whether they were now valid
- seeing if there was a higher ratio of validity for pages that had  
been validated multiple times, as opposed to once.
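
For what it's worth, the three steps above could be sketched roughly as
follows. This is only an illustration, not the script I actually ran: the
log format (pairs of URI and timestamp), the function name
validity_ratios and the is_valid_now callback are all assumptions made
for the example, and the callback stands in for a real call to a
validator.

```python
from collections import Counter
from datetime import datetime, timedelta


def validity_ratios(log, is_valid_now, now, window=timedelta(hours=24)):
    """Compare current validity of pages validated once vs. several times.

    log          -- iterable of (uri, validated_at) pairs (assumed format)
    is_valid_now -- callable taking a URI and returning True/False
                    (hypothetical stand-in for a validator request)
    now          -- reference time for the look-back window
    """
    # Step 1: count validations per URI inside the last `window`.
    counts = Counter(uri for uri, ts in log if now - window <= ts <= now)

    # Step 2: check whether each of those URIs is valid now,
    # grouped by how often it was validated.
    groups = {"once": [], "multiple": []}
    for uri, n in counts.items():
        groups["once" if n == 1 else "multiple"].append(is_valid_now(uri))

    # Step 3: ratio of valid pages per group (None if a group is empty).
    return {
        name: (sum(flags) / len(flags) if flags else None)
        for name, flags in groups.items()
    }
```

With a few hundred URIs per run, the two ratios will be noisy, so any
difference between the groups should be read with caution.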

I found that the ratio of valid pages (after 24 hours) was around 50%
for "clients" of the W3C validators. Given the "modern" profile of
Nikita's clients (lots of UTF-8, lots of XHTML), I was wondering if it
was similar, or higher.

Thanks,
-- 
olivier

Received on Tuesday, 8 April 2008 12:45:45 UTC