- From: Brian Wilson <bloo@blooberry.com>
- Date: Thu, 16 Oct 2008 01:13:44 -0700 (PDT)
- To: www-validator@w3.org
I don't want to steal thunder from Philip's announcement yesterday of his "By The Numbers - Fall 2008" study, but it is also time for me to announce another validation study. I've written a tool called MAMA ("Metadata Analysis and Mining Application"), which analyzes a Web page and tracks as many of its structures as possible (including markup, CSS and scripting). As part of this process, all pages analyzed are also run through the W3C markup validator. So far, ~3.5 million URLs have been analyzed. I've been working on this project for quite some time now and it is finally time to share some of its findings...starting with validation.

Condensed validation highlights:
http://dev.opera.com/articles/view/mama-markup-validation-report/

Full validation study (long):
http://dev.opera.com/articles/view/mama-w3c-validator-research-2/

Here is a peek at the index of the full version:

   1. About markup validation: an introduction
   2. Previous validation studies
   3. Sources and tools: The URL set and the validator
   4. What use is markup validation to an author?
   5. How many pages validated?
   6. Interesting views of validation rates, part 1: W3C-Member companies
   7. Interesting views of validation rates, part 2: Alexa Global Top 500
   8. Validation badge/icons: An interesting diversion?
   9. Doctypes
  10. Character sets
  11. Validator failures
  12. Validator warnings
  13. Validator errors
  14. Summing up ...
  15. Appendix: Validation methodology

MAMA's main analysis of the URLs occurred in November 2007 but the validation portion occurred in January 2008. After completing all that, doing a write-up of the validation findings was the first topic I tackled. I figured that the section would be fairly brief. Boy, was I wrong; it turned out to be the *longest* of any of MAMA's topics. There is a lot to say about the process of validation! The validation study was written specifically with the W3C validator mailing list in mind, so it gets technical and long-winded at times.
The extra levels of detail should create added fun and mystery for one and all.

The validation study is also the first of MAMA's main results to be released. The main index for the MAMA project results is here:
http://dev.opera.com/articles/view/mama/
It provides a lot of additional information, including motivation, a quick summary of some of the major results, and some of MAMA's methodologies. The index is also where you can access the new articles as they come out (about two dozen over the coming weeks on different Web page topics). These won't be directly about validation but may still be of interest.

For the future, the plan is for MAMA to repeat this mass-validation process at regular intervals so as to provide additional data about how Web page validation trends change over time.

Bringing things back to Philip's study, I think the similarities and differences between our two studies can produce interesting points for further discussion.

Many thanks to Philip, Olivier and Karl for discussions and input along the way on MAMA's validation study. I hope you all find it worth the read.

Thanks,

-Brian

Brian Wilson --------------------------"Those aren't Sex muffins! -Coach
bloo@blooberry.com ---------------------Those aren't Love muffins!
http://www.blooberry.com ---------------Those are just BLOOberry muffins!"
Creator of Index DOT Html/Css: http://www.blooberry.com/indexdot/
Received on Thursday, 16 October 2008 08:14:20 UTC