- From: Steve Faulkner <faulkner.steve@gmail.com>
- Date: Wed, 17 Oct 2012 01:47:00 +0200
- To: HTMLWG WG <public-html@w3.org>
- Message-ID: <CA+ri+VndtQWCav0UZt7pp4YUT6GO4CwBh7J1wHMnc=FBGYy42w@mail.gmail.com>
Hi all,

In the process of developing the <maincontent> element spec [1] I looked at data from a number of sources [3] on how frequently id values are used to indicate the main content area of a web page. I also used data [2] that I gathered in April 2012, based on a URL list of the top 10,000 most popular web sites.

In preparing the data [2] I took the total usable HTML documents (approx 8900 pages: the home pages for sites in the top 10,000 URLs list) and extracted the subset that use the HTML5 doctype (approx 1545 pages). I figured that documents using the HTML5 doctype would provide the freshest code.

What is apparent from the home page data in the sample:

* Use of a descriptive id value to identify the main content area of a web page is common (id="main" | id="content" | id="maincontent" | id="content-main" | id="main-content" used on 39% of the pages in the sample [2]).
* There is a strong correlation between use of role='main' on an element and id values of 'content' or 'main' or permutations: where role='main' was used (101 pages), 77% were on an element with id values of 'content' or 'main' or permutations.
* There is a strong correlation between use of id values of 'content' or 'main' or permutations and targets for 'skip to content'/'skip to main content' links: where skip links were used (67 pages), 78% of skip link targets were elements with id values of 'content' or 'main' or permutations.
* There appears to be a strong correlation between the content areas identified (with id values of 'content' or 'main' or permutations) and what is described in the spec as appropriate content to be contained within a <maincontent> element [1]:

"The maincontent element represents <http://dev.w3.org/html5/spec/rendering.html#represents> the main content section of the body of a document or application. The main content section consists of content that is directly related to or expands upon the central topic of a document or central functionality of an application. ...
The main content section of a document includes content that is unique to that document and excludes content that is repeated across a set of documents such as site navigation links, copyright information, site logos and banners and search forms (unless the document or applications main function is that of a search form)."

I have prepared approx 440 sample pages [4] from the same URL set, with CSS added to outline and identify container elements with id values of 'content' and/or 'main' and role=main. These samples can be used to visually assess how closely the spec text matches the reality of element usage with the stated id values. In each list item, the first link goes to the original page; the second link, prefixed with "copy", is the same page with the CSS added.

[1] https://dvcs.w3.org/hg/html-extensions/raw-file/tip/maincontent/index.html
[2] http://www.paciellogroup.com/blog/2012/04/html5-accessibility-chops-data-for-the-masses/
[3] http://triin.net/2006/06/12/CSS#figure-34,
    http://westciv.typepad.com/dog_or_higher/2005/11/real_world_sema.html,
    http://dev.opera.com/articles/view/mama-common-attributes/#id
[4] http://www.html5accessibility.com/tests/HTML5-main-content/

--
with regards

Steve Faulkner
Technical Director - TPG

www.paciellogroup.com | www.HTML5accessibility.com | www.twitter.com/stevefaulkner
HTML5: Techniques for providing useful text alternatives - dev.w3.org/html5/alt-techniques/
Web Accessibility Toolbar - www.paciellogroup.com/resources/wat-ie-about.html
Received on Tuesday, 16 October 2012 23:48:08 UTC