Re: Metrics sheet

From: Anja Feldmann (anja@research.att.com)
Date: Fri, Apr 23 1999


Date: Fri, 23 Apr 1999 10:17:06 -0400 (EDT)
Message-Id: <199904231417.KAA27951@spectrum.research.att.com>
From: Anja Feldmann <anja@research.att.com>
To: lavoie@oclc.org
Cc: www-wca@w3.org
Subject: Re: Metrics sheet

>http://www.oclc.org/oclc/research/projects/webstats/wcadrafts.htm

Looking through the draft, I see a lot of terms but no firm definitions.

There are a lot of terms that can mean completely different things to
different people, or may not even have a unique answer. Here are just a
few examples that are questionable:
	  - geographical location (may not be unique; may move around or
	    change over time, e.g. with reverse proxies or hashing)
	  - number of files on a server (what is a file here anyway?)
	  - what is the accessibility of a server? (when is a server
	    inaccessible: after dropping 1 connection? 10 connections? not
	    accepting connections for x seconds/minutes/hours?)
	  - how can one measure implicit vs explicit clicks?
	  - what about things measured at a proxy? Is the proxy considered to
	    be a client as well as a server? (If so, what is the impact of
	    this on things like unique Web sites? :-)
	  - where is the network in all of these definitions?
	  - obtaining even approximations for the core or the periphery is
	    hard; I propose staying away from ratios of quantities whose
	    uncertainties may span orders of magnitude
	  - how about addressing things such as time resolution? 
	  .....

I believe we need a slightly different approach here: for every proposed
metric we need at least one well-defined way of actually computing it.
Otherwise the metric and the terms are useless. Everyone will have a
different way of computing the metric, and thus we will still be comparing
apples to oranges.
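To make the point concrete, here is a minimal sketch of what "at least one
well-defined way of computing it" could look like for the accessibility
question raised above. All names and thresholds (MAX_FAILURES, probe_results)
are illustrative assumptions, not definitions from the draft:

```python
# Hypothetical operational definition (assumption, not from the draft):
# a server counts as "inaccessible" if its last MAX_FAILURES consecutive
# connection probes all failed.

MAX_FAILURES = 10  # illustrative threshold; the draft would have to fix this

def is_inaccessible(probe_results):
    """probe_results: list of booleans in chronological order,
    True = connection accepted, False = connection refused/dropped.
    Returns True only once MAX_FAILURES consecutive probes have failed."""
    if len(probe_results) < MAX_FAILURES:
        return False  # not enough observations to decide
    return not any(probe_results[-MAX_FAILURES:])

# Example: 2 accepted probes followed by 10 failures -> inaccessible
print(is_inaccessible([True, True] + [False] * 10))  # True
```

The specific thresholds matter less than the fact that two measurers running
this same procedure on the same probe log will get the same answer.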

      Anja