Re: Invalid pages...

From: olivier Thereaux <ot@w3.org>
Date: Fri, 16 Feb 2007 10:00:39 +0900
Message-Id: <E17DC33D-09C8-40F6-B8F4-EC9EED2B6A27@w3.org>
Cc: www-validator@w3.org
To: Bachu <bachu9@gmail.com>


On Feb 15, 2007, at 03:11 , Bachu wrote:

> Hai I am Bachu from India, why is that in w3.org has 7% of pages ie  
> 14 pages didnt pass the w3 validation? is this some error in  
> validator or in the pages ?
>

Hello Bachu.

Like Jukka, I am wondering where you got these numbers. There are  
millions of pages (literally) on w3.org Web servers, so 14 pages  
definitely isn't 7%. Also, when you say "w3 validation", what do you
mean? There are many validation services at W3C: markup validation,
CSS validation, feed validation, RDF, P3P, etc.

Assuming your message ultimately means "why does www.w3.org have some
pages that are not valid HTML", I can offer a few elements of an answer.

* While some proportion of the documents on w3.org (all of
lists.w3.org, wiki pages and W3C blogs) are managed by a CMS or some
other piece of software, a large number of pages are written "by hand"
by a lot of different people (staff, collaborators, working-group
participants). A lot of these pages are also edited with Amaya [1],
which always produces valid markup.

* Given the way the site is edited, it is prone to errors. We
therefore have a quality control process using tools such as the
LogValidator [2], which sends a weekly report to the staff listing
the most popular documents on the www.w3.org site that do not
validate. The staff then collectively work on improving them.

Our latest results show about 98% validity among the ~2000 most popular
HTML documents on www.w3.org - not perfect, but given that most of
these are edited by hand, it's not bad. And of course, given the
iterative QA process, the figure is bound to improve.

We also have an opt-in service for people to receive personalized
weekly reports about the validity of their recently edited documents
(since our whole site is in CVS and we have its history) or about
the validity of an area of the site they manage.

* The "official" pages on the W3C website, that is, the standards and
technical reports, are all validated before publication. And since
quality goes way beyond validation, they are also checked for broken
links and spelling mistakes (via an automatic spell checker).
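
To make the weekly report idea above more concrete, here is a minimal
sketch in Python. It is not the LogValidator's actual code: the function
names, the `is_valid` callback and the default limit of 2000 are all
assumptions made for illustration. It ranks requested paths by
popularity, checks the most popular ones with a validator function
supplied by the caller, and returns the invalid ones along with the
validity percentage.

```python
from collections import Counter
from typing import Callable, Iterable


def most_popular(paths: Iterable[str], limit: int) -> list[str]:
    """Return the `limit` most requested paths, most popular first."""
    return [path for path, _count in Counter(paths).most_common(limit)]


def invalid_report(
    paths: Iterable[str],
    is_valid: Callable[[str], bool],  # hypothetical validator hook
    limit: int = 2000,                # assumed size of the "most popular" set
) -> tuple[list[str], float]:
    """Return (invalid popular paths, percentage of popular paths that are valid)."""
    popular = most_popular(paths, limit)
    invalid = [path for path in popular if not is_valid(path)]
    if not popular:
        # Nothing was requested, so there is nothing invalid to report.
        return invalid, 100.0
    validity = 100.0 * (len(popular) - len(invalid)) / len(popular)
    return invalid, validity
```

In practice `paths` would come from parsing the Web server's access
logs, and `is_valid` would call whichever validation service is being
used; both are left out here because they depend on the local setup.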

Hope this answers your questions.
-- 
olivier
Received on Friday, 16 February 2007 01:00:50 GMT
