- From: Simon Pieters <simonp@opera.com>
- Date: Wed, 14 Dec 2011 09:48:13 +0100
- To: "Boris Zbarsky" <bzbarsky@mit.edu>
- Cc: public-webapps@w3.org
On Wed, 14 Dec 2011 09:15:12 +0100, Boris Zbarsky <bzbarsky@mit.edu> wrote:

> On 12/14/11 3:01 AM, Simon Pieters wrote:
>>> What I have so far as a result is a list of about 1.7 million
>>> barewords used across several tens of thousands of pages.
>>
>> Do you have a more accurate figure for the number of pages?
>
> "57,444 unique urls, all taken from the top 21,000 domains" is all the
> information I have there so far.

Thanks!

>>> If people are interested in the exact methodology, I can probably get
>>> a description.
>>
>> I'm interested. It's hard to make conclusions from data without knowing
>> what the data is, how it is biased, what false positives it might have,
>> etc.
>
> Yeah, understood. Working on getting that description.
>
> -Boris

cheers
-- 
Simon Pieters
Opera Software
Received on Wednesday, 14 December 2011 08:48:47 UTC