- From: David Dailey <david.dailey@sru.edu>
- Date: Mon, 30 Apr 2007 11:41:57 -0400
- To: Karl Dubost <karl@w3.org>
- Cc: connolly@w3.org, www-archive@w3.org, st@isoc.nl, zdenko@ardi.si, sean@elementary-group.com
+www-archive -www-public

At 10:43 PM 4/29/2007, Karl wrote (in message at http://lists.w3.org/Archives/Public/public-html/2007Apr/1704.html):

>Doing a survey is tricky but very interesting, we need to clearly
>define the methodology so that we know how to interpret the results.
>Some previous results gave only the compiled results which makes it
>difficult to interpret.

Hi Karl,

As I mentioned (http://lists.w3.org/Archives/Public/public-html/2007Apr/1544.html), Sander and I began having possibly related discussions of methodology somewhat in parallel but offlist, since there seem to be two differing ideas about why one might want to do such sampling of web sites. I had proposed a slightly different methodology from yours, thinking it may or may not prove to be of interest. At the end of this message are some of my comments on such a methodology.

My idea was to form a stratified sample drawn from several points along the spectrum of web pages: a) the top 200, b) the Alexa 500, c) random pages, and d) "weird" or fringe cases assembled by hand, and then to cross that with a variable representing instances of either standards or browsers.

Your approach (to what may ultimately be a different problem) considers a number of things I didn't, though the browser-sniffing stuff you mention is something I was thinking about. I don't know whether one can robotically parse a document so that it looks the way it would in Opera, FF, Safari, IE, etc.; I was rather naively assuming a fleet of grad students would fill out that part of the experimental design by hand.

The other thing relevant to the discussion, I think, is the issue of the many different kinds of web content (somewhat as you mention) -- blogs, news feeds, ordinary web pages, wikis, HTML fragments, print, email, etc. That could get complicated fast, it seems.

Also germane to the discussion may be some of the concerns of the folks interested in usability studies. See for example http://lists.w3.org/Archives/Public/public-html/2007Apr/0962.html, in which the classes of pages are further classified by author type (e.g., search engines vs. corporate sites).

It may make sense to convene a conversation bringing together both the survey and the usability folks, since some of the methodological concerns may overlap. Just an idea -- thinking out loud.

David

--------<quote>---------------------
The other two folks I mentioned [zdenko and sean, cc-ed above] are involved in the business of sampling the 200 sites, so it might be best to get them involved as well. I didn't sign up for this particular task, since standards effectiveness is a more tangential concern of mine (though I am really glad someone is looking at it).

I would tend to think the methodology oughta look something like this:

                       method of evaluation
                standards             browsers
                S1   S2   S3     B1   B2   B3   B4
  pages   p1
          p2
          p3
          p4
          p5

where both standards and browsers are used as repeated measures over pages. Pages are randomly chosen within categories C = {Top200/50, Alexa500/50, random50, weird50}. One samples 50 pages from each category, and then one has a classical mixed-model analysis of variance with repeated measures and only one random-effects variable. The dependent variable can be either discrete (+ or -) or continuous; it didn't much matter, the last time I studied statistics. Then we have a somewhat stratified sample that can be compared across sampling strategies. But the idea is to sample as divergent a group of pages as possible.
To get the random 50 -- I'm not sure what the best methodology is. I suggested StumbleUpon (but it has its own idiosyncrasies), and I remember some search engines have a "find a random page" feature, so one might be able to track down how they do that. Someone in our group must know.

To get a weird 50 -- I have a couple of eclectic collections:

http://srufaculty.sru.edu/david.dailey/javascript/various_cool_links.htm is one
http://srufaculty.sru.edu/david.dailey/javascript/JavaScriptTasks.htm is another

Both are peculiar in the sense that they attempt to probe the boundaries of what is possible with web technologies -- some are heavily Flash, some heavily JavaScript; many don't work across browsers, and in many cases I don't know why. I'm too busy to track it all down. (Some of my pages are several years old and used to work better than they do now.) My emphasis has been far less on standards than on what works across browsers -- the standards and the browsers generally seem to have so little to do with one another.

A proper methodology for the weird sites: have a group of volunteers explain what they are looking for (a collection of fringe cases) and let others contribute to a list. I don't know. A simpler methodology: have a group of volunteers just sit down and come up with a list of sites believed to push the frontier.
------------</quote>--------------------
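For the sampling step, here is a rough Python sketch of drawing the four 50-page strata mentioned in the quoted design. The candidate pools and URLs below are placeholders (not real lists); in practice they would be the top-200 list, the Alexa 500, a pool of randomly discovered URLs, and a volunteer-contributed "weird" list.

import random

def sample_stratum(candidates, n, seed=2007):
    """Draw n distinct URLs from one candidate pool."""
    pool = sorted(set(candidates))            # de-duplicate, keep a stable order
    if len(pool) < n:
        raise ValueError(f"only {len(pool)} candidates, need {n}")
    return random.Random(seed).sample(pool, n)

# Placeholder pools only; real candidate lists would be substituted here.
candidate_pools = {
    "top200":   [f"http://example.org/top/{i}" for i in range(200)],
    "alexa500": [f"http://example.org/alexa/{i}" for i in range(500)],
    "random":   [f"http://example.org/rand/{i}" for i in range(1000)],
    "weird":    [f"http://example.org/weird/{i}" for i in range(80)],
}

PAGES_PER_STRATUM = 50
sample = {name: sample_stratum(urls, PAGES_PER_STRATUM)
          for name, urls in candidate_pools.items()}

for name, urls in sample.items():
    print(name, len(urls))                    # 50 pages in each stratum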
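For the analysis step, here is a similarly rough sketch of the repeated-measures layout in the quoted design: every sampled page is evaluated under each of the seven methods (three standards, four browsers), giving a long-format table with the page as the single random-effects factor. The method names, category labels, and pass/fail values are placeholders, not real measurements, and statsmodels' linear mixed model is used as one plausible stand-in for the classical repeated-measures ANOVA.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

methods = ["S1", "S2", "S3", "B1", "B2", "B3", "B4"]    # 3 standards + 4 browsers
categories = ["top200", "alexa500", "random", "weird"]  # the four strata
rng = np.random.default_rng(0)

# Long format: one row per (page, evaluation method) observation, so each
# page contributes 7 repeated measures.
rows = []
for cat in categories:
    for i in range(1, 51):                              # 50 pages per stratum
        page_id = f"{cat}_p{i}"
        for m in methods:
            rows.append({"page": page_id,
                         "category": cat,
                         "method": m,
                         "passes": int(rng.integers(0, 2))})  # placeholder + / -
df = pd.DataFrame(rows)

# A random intercept per page stands in for the single random-effects
# variable; 'passes' could equally be a continuous score rather than 0/1.
model = smf.mixedlm("passes ~ C(method) + C(category)", df, groups=df["page"])
print(model.fit().summary())

Fitting the page as a random intercept is just one way to realize "only one random-effects variable"; a dedicated repeated-measures ANOVA routine would serve equally well.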
Received on Monday, 30 April 2007 15:41:48 UTC