W3C home > Mailing lists > Public > public-evangelist@w3.org > March 2004

Re: The use of W3C standards in Denmark Part II -- some other data

From: Olle Olsson <olleo@sics.se>
Date: Sat, 06 Mar 2004 12:30:53 +0100
Message-ID: <4049B66D.5090404@sics.se>
To: Soren Johannessen <hal@ae35-unit.dk>
Cc: public-evangelist@w3.org


 >  does the Swedish government also recommend/advocate W3C standards

Yes they do.

At a political level there is no question about adherence to standards 
-- standards: "YES"

At a practical level, there is a great variation in how well different 
agencies succeed in complying to these standards -- standards: "we 
THOUGHT we were OK".

Part of the problem is that the public sector uses commercial tools for 
content-management/publishing, and such tools fail to comply to 
standards. The situation is gradually getting better, of course, but 
slowly (at the pace that Marko Karppinen detected in his investigation).

There was an initial unfounded expectation that if rules about web 
standards for the public sector are stated (at the government level), 
then everything will automatically become compliant. Now there is a 
growing insight that such rules only indirectly have an impact on the 
public websites:
    public policy is stated [standard compliancy]
     --> requirements for procurement of authoring/publishing tools
     --> tool vendors see a requirement from a potentially big market
     --> the vendors adapt their tools to these requirements
     --> the public sector agencies get good tools
     --> the public sector must learn how to use theese tools in correct 
     --> .... and then we will see good pages on the web

At the moment there are a number of initiatives in the public sector to 
support the complex of web-presence of the public sector. One aspect is 
compliance to technical standards, another is living up to accessibility 
requirements (WAI), yet other things concerns usability in the broader 
spectrum, as well as interoperability across different agencies (not 
only web services, but also syndication of information).

There was a broad investigation of accessibility (WAI) issues in the 
public sector -- it created a heated debate about how such evaluations 
should be performed. I know of no large scale investigation of 
compliance to technical standards in the public sector.

But, of course, such a study could be done.


Soren Johannessen wrote:

>Hi Olle
>One question does the Swedish government also recommend/advocate W3C
>standards in  Swedish  governmental/national/municipal authorities home
>page? It could be interesting to compare Denmark and Sweden 
>(also Norway & Finland if they have same politics for use of W3C
>Some statistics about correctness of web pages.
>... which confirms some of the data presented in this thread.
>Two years ago I made a minor survey of error frequency in Swedish web
>pages. The aim was to get some initial data about how well standards are
>adhered to.
>I only looked at commercial websites, and only on the home page of these
>sites. The commercial web sites selected were fetched from a list
>companies registered on the stock exchange, a list that provides
>information for financial analysts.
>The investigation was, for simplicity restricted to automatic validity
>checking, with some manual intervention in the selection and filtering
>The W3C HTML validator was the tool used, and no attempt was made to 
>obtain a
>finegrained classification of the types of errors found -- something
>that is inherently difficult, if one tries an approach based on
>automated analysis. At that point in time, the only quick-and-dirty way
>of accessing the results of the validator was to extract ínformation
>from the HTML page returned by the validator. Not a nice way, but it was
>doable. There was some discussion at that point that a more programmatic
>interface would be useful, e.g. "W3C Validator as a Web Service". But I
>had to make do with what was available, hence the indirect way of
>dissecting the page generated by the Validator.
>There were a number of situations that had to be clearly identified, if
>the outcome was to be trusted.  E.g., to be able to separate those cases
>where no
>real checking was made made by the Validator, I had to identify at least
>those occurences where the Validator complained that it could not check
>the page for some reason.
>Some statistics
>The initial list of companies (company web sites) that was used ...
> - number of initially selected pages: 330 pages
>But some sites were down or returned something that was not HTML, so...
> - number of actually investigated pages: 280 pages
>The sizes of the pages varied significantly:
> - minimum page size: 89 characters
> - maximum page size: 129 832 characters
>DOCTYPE was a big problem:
> - percentage of pages that did not declare DOCTYPE: 76 %
>... or at least no DOCTYPE was recognised by the validator!
>No page passed W3C Validator checking! The span in error remarks was
> - minimum number of errors: 1
> - maximum number of errors: 591
>As some pages triggered avalanches of errors, and some pages were
>extreme in size, they could create bad effects on the statistics. So
>"extreme" pages were filtered out, and statistics was only retained for
>pages that fulfilled the conditions:
> - size of page: 1,000 -- 50,000 characters
> - number of error remarks on page: < 200
>As to the reason for excluding "small" pages is obvious -- the size of a
>correct (and reasonable) "hello world" page is on the order of 200
>characters. Small pages are also found when FRAMESETs are used, and such
>pages were not further studied.
>This resulted in an effective set of pages ...
> - number of pages used in final statistics: 226
>This set was partitioned into two groups -- IT-companies and other
>companies. The reason for this partitioning was that one would think
>that IT-companies (e.g. IT-vendors or IT consultants) would be better at
>constructing correct pages than other companies (e.g. household
>appliance vendors or transport companies) ...
> - number of IT companies: 76
> - number of other companies: 150
>The result turned out to be that no clear difference, w.r.t. errors, 
>could be
>detected between these two types of companies, which was not what one
>would expect.
>There is a write-up (in Swedish ;-) ) of these things on:
>  http://www.w3c.se/resources/office/papers/memo1/memo1.html
>On that page there are also some diagrams that describe the correlation 
> * size-of-page vs numbers-of-errors
> * size-of-page vs numbers-of-errors-per-1000-characters-of-page
>Each green dot represents one of the 226 pages investigated.
>For those that might want to look at the diagrams, the texts in these
>can be translated as:
> * "antalet" = "number"
> * "sidstorlek" = "page size"
> * "anmärkningar" = "error remarks"
> * "antalet anmärkningar för sida" = "number of error remarks per page"
> * "antalet anmärkningar per 1000 tkn i sida" = "number of error remarks
>per 1000 chars in page"
> * "anmärkningar/1000 tkn" = "error remarks/1000 chars"
>The two last diagrams on that page portray the relationship between size
>of pages and the number of pages of that size. Here we aggregate pages
>into groups of pages (0-2500, 2501-5000, 5000-7500, ... page size in
>characters) and do the correlation of the size represented by each group
>to the 
>number of
>pages in each group. This is nicely described as a Zipf distribution.

Olle Olsson   olleo@sics.se   Tel: +46 8 633 15 19  Fax: +46 8 751 72 30
	[Svenska W3C-kontoret: olleo@w3.org]
SICS [Swedish Institute of Computer Science]
Box 1263
SE - 164 29 Kista
Received on Saturday, 6 March 2004 06:31:01 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 6 January 2015 20:16:18 UTC