Re: survey of top web sites from Sander Tekelenburg on 2007-04-27 (public-html@w3.org from April 2007)

From: Sander Tekelenburg <st@isoc.nl>
Date: Fri, 27 Apr 2007 02:25:30 +0200
To: public-html@w3.org
Message-Id: <p06240606c256eaeaa410@[192.168.0.101]>

At 12:05 -0400 UTC, on 2007-04-25, David Dailey wrote:

[...]

> It is natural to ask "are popular sites representative of the
> web as a whole?"

I don't think they are at all. See below.

But as I understand it, the idea of tracking the 'top 200 pop sites' is that
if an HTML 5 implementation would break anything there, it would affect many
users and be very visible. In that sense it might be useful to track them, as
long as we realise that not breaking them doesn't mean not breaking 200
million less popular sites.

Ian Hickson says he has reviewed the code of, I believe, 100 million or so
web pages, so that is probably more useful data.

> There are at least two differences between popular and "other" that
> we might expect: 1. popular sites are probably less likely to engage
> in "adventurous" behavior (unless you are one of the companies
> represented on W3C HTML WG of course) -- that is,  they are less
> likely to push frontiers and edges of use-cases. Too much is at stake
> to be very experimental

Once they are popular, they'll probably be afraid to make changes, yes. But
to *get* popular typically does require being somewhat adventurous.

> 2. they are more likely to be coded well.

From what I've seen, popular sites are coded at least as poorly as any other
site, if not more so. (Sites are open source, so it's easy for anyone
interested to judge for themselves.) They tend to serve invalid HTML and to
depend on CSS, JavaScript, Flash, cookies, etc. My impression is they're not
coding against specs, but against one or two particular implementations, and
are probably in part relying on broken authoring tools.

I say "if not more so", because Joe Average's blog is likely to be simpler.
Joe doesn't know how to make things complicated, so he'll probably not engage
in browser sniffing, track users through cookies, or make fancy things happen
through JavaScript. So at least in that sense Joe is less likely to go wrong.

(Mind you, neither is likely to attempt to provide semantic markup if their
tools don't invite them to. "If that div looks big enough, who needs a
stinking h1?")

-- 
Sander Tekelenburg
The Web Repair Initiative: <http://webrepair.org/>

Received on Friday, 27 April 2007 00:31:20 UTC