
Alexa 100 - Requests to distinct hosts

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Fri, 06 Jun 2008 03:55:23 +0200
To: www-archive@w3.org
Message-ID: <vm5h44972q0nvojf2eh7qvom3egis4080c@hive.bjoern.hoehrmann.de>

Hi,

  The following table shows the number of HTTP requests (Req) that Opera
9.x would make when loading the respective site, and the number of
distinct hosts (#H) those requests went to. The list of sites comes from
the Alexa 100. Some sites had to be omitted due to tool bugs. Also note
that HTTP over TLS requests have been omitted.

  +---------------------------+-----+----+
  | Site                      | Req | #H |
  +---------------------------+-----+----+
  | www.nba.com               | 207 |  7 |
  | www.gamespot.com          | 186 | 19 |
  | www.cnn.com               | 168 | 17 |
  | www.ign.com               | 136 | 28 |
  | www.nytimes.com           | 131 | 19 |
  | www.espn.go.com           | 119 | 16 |
  | www.cnet.com              | 113 | 14 |
  | www.download.com          | 108 | 10 |
  | www.digg.com              | 101 | 15 |
  | www.gmx.net               |  98 |  9 |
  | www.fotolog.net           |  94 | 25 |
  | www.doubleclick.com       |  94 |  4 |
  | www.888.com               |  90 |  5 |
  | www.perfspot.com          |  87 | 15 |
  | www.bebo.com              |  84 | 19 |
  | www.amazon.com            |  81 |  7 |
  | www.comcast.net           |  80 | 10 |
  | www.aim.com               |  77 | 15 |
  | www.dailymotion.com       |  76 |  6 |
  | www.hp.com                |  76 |  5 |
  | www.veoh.com              |  74 | 14 |
  | www.veoh.com.pdml         |  74 | 14 |
  | www.metacafe.com          |  73 | 17 |
  | www.aol.com               |  68 | 11 |
  | www.youtube.com           |  68 |  9 |
  | www.weather.com           |  67 | 13 |
  | www.tinypic.com           |  62 | 15 |
  | www.adobe.com             |  61 |  3 |
  | www.msn.com               |  59 | 15 |
  | www.partypoker.com        |  59 |  6 |
  | www.icq.com               |  58 |  6 |
  | www.isohunt.com           |  57 | 14 |
  | www.myspace.com           |  56 | 18 |
  | www.ebay.co.uk            |  55 |  8 |
  | www.yahoo.com             |  54 | 12 |
  | www.pornhub.com           |  54 |  9 |
  | www.bbc.co.uk             |  50 |  8 |
  | www.ebay.com              |  50 |  7 |
  | www.mininova.org          |  49 | 19 |
  | www.yourfilehost.com      |  48 | 16 |
  | www.4shared.com           |  47 | 11 |
  | www.sourceforge.net       |  45 | 14 |
  | www.filefactory.com       |  45 | 11 |
  | www.imeem.com             |  45 | 10 |
  | www.microsoft.com         |  45 |  8 |
  | www.rediff.com            |  44 |  8 |
  | www.flickr.com            |  43 |  8 |
  | www.hi5.com               |  42 | 14 |
  | www.netlog.com            |  42 |  7 |
  | www.metroflog.com         |  41 | 20 |
  | www.reference.com         |  40 | 15 |
  | www.photobucket.com       |  40 | 10 |
  | www.livejasmin.com        |  39 | 10 |
  | www.about.com             |  38 |  6 |
  | www.fastclick.com         |  38 |  4 |
  | www.clicksor.com          |  38 |  3 |
  | www.go.com                |  37 | 11 |
  | www.friendster.com        |  36 |  7 |
  | www.imageshack.us         |  35 | 13 |
  | www.megaupload.com        |  35 |  4 |
  | www.mozilla.com           |  35 |  3 |
  | www.geocities.com         |  34 |  9 |
  | www.depositfiles.com      |  34 |  5 |
  | www.adultfriendfinder.com |  33 |  6 |
  | www.mediafire.com         |  33 |  4 |
  | www.facebook.com          |  30 |  3 |
  | www.wordpress.com         |  29 | 10 |
  | www.badongo.com           |  27 |  7 |
  | www.imdb.com              |  27 |  7 |
  | www.zshare.net            |  24 |  3 |
  | www.adultadworld.com      |  22 |  3 |
  | www.rapidshare.com        |  22 |  3 |
  | www.information.com       |  21 |  2 |
  | www.studiverzeichnis.com  |  20 |  5 |
  | www.realitykings.com      |  17 |  6 |
  | www.easy-share.com        |  15 |  4 |
  | www.youporn.com           |  14 |  9 |
  | www.multiply.com          |  13 |  6 |
  | www.torrentz.com          |  12 |  3 |
  | www.imagevenue.com        |  11 |  7 |
  | www.live.com              |  11 |  5 |
  | www.megarotic.com         |  10 |  4 |
  | www.soso.com              |   9 |  4 |
  | www.thepiratebay.org      |   9 |  4 |
  | www.blogger.com           |   7 |  4 |
  | www.usercash.com          |   7 |  3 |
  | www.orkut.com             |   6 |  4 |
  | www.brazzers.com          |   6 |  3 |
  | www.google.com            |   6 |  3 |
  | www.craigslist.org        |   5 |  2 |
  | www.google.ca             |   5 |  2 |
  | www.google.co.id          |   5 |  2 |
  | www.google.co.in          |   5 |  2 |
  | www.google.co.th          |   5 |  2 |
  | www.google.co.uk          |   5 |  2 |
  | www.google.co.za          |   5 |  2 |
  | www.google.com.au         |   5 |  2 |
  | www.redtube.com           |   5 |  2 |
  +---------------------------+-----+----+

The data has been generated using the XQuery

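  (: URI of the PDML document to analyse, supplied by the caller :)
  declare variable $url external;
  
  (: true if the enclosing packet has an HTTP layer with an
     http.request field as a direct child :)
  declare function local:isHttpRequest($a as node()) as xs:boolean {
    let $y := $a/ancestor-or-self::packet[1]
    return $y/proto[@name = 'http']/field/@name = 'http.request'
  };
  
  (: emit the URI, the request count, the distinct host count
     and a line break :)
  for $f in doc($url)
  return
  ( $url,
    count($f//packet[ local:isHttpRequest(.) ]),
    count( distinct-values( $f//field[@name = 'http.host']/@show  ) ), "
  " )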

applied to the output of

  % tshark -o tcp.desegment_tcp_streams:TRUE \
           -o http.desegment_headers:TRUE \
           -o http.desegment_body:TRUE \
           -o http.dechunk_body:FALSE \
           -o http.decompress_body:FALSE \
           -T pdml -r example.pcap "http"

where example.pcap has been generated by another script that
automates the Opera web browser and the Net::Pcap Perl module. From
launching Opera to terminating it, each site had 60 seconds to load
all its resources; there was no cache or anything else that might
have interfered with the capture.
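
For readers without an XQuery processor at hand, the two counts can
also be sketched in Python. This is only a rough restatement of the
query above, not the tool that produced the table. The function name
count_requests_and_hosts and the SAMPLE document are made up for
illustration; SAMPLE is a hand-written fragment that mimics the PDML
element and field names the query relies on, not real tshark output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PDML-like fragment (not real tshark output):
# three request packets to two distinct hosts, plus one response packet.
SAMPLE = """<pdml>
  <packet>
    <proto name="http">
      <field name="http.request" show="1"/>
      <field name="http.host" show="www.example.org"/>
    </proto>
  </packet>
  <packet>
    <proto name="http">
      <field name="http.request" show="1"/>
      <field name="http.host" show="img.example.org"/>
    </proto>
  </packet>
  <packet>
    <proto name="http">
      <field name="http.request" show="1"/>
      <field name="http.host" show="www.example.org"/>
    </proto>
  </packet>
  <packet>
    <proto name="http">
      <field name="http.response" show="1"/>
    </proto>
  </packet>
</pdml>"""


def count_requests_and_hosts(pdml_text):
    """Return (request count, distinct host count) for a PDML string."""
    root = ET.fromstring(pdml_text)

    # A packet counts as a request if its http proto element has an
    # http.request field as a direct child, as in local:isHttpRequest.
    requests = sum(
        1
        for packet in root.iter("packet")
        if packet.find("proto[@name='http']/field[@name='http.request']")
        is not None
    )

    # Distinct "show" values of all http.host fields in the document,
    # corresponding to distinct-values(...) in the query.
    hosts = {
        field.get("show")
        for field in root.iter("field")
        if field.get("name") == "http.host" and field.get("show") is not None
    }
    return requests, len(hosts)


if __name__ == "__main__":
    req, num_hosts = count_requests_and_hosts(SAMPLE)
    print(req, num_hosts)
```

To apply it to a real capture, one would read the PDML file written by
the tshark command above and pass its contents to the function. For
large captures a streaming parser would probably be the better choice,
since this sketch loads the whole document into memory.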

regards,
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Friday, 6 June 2008 01:56:03 GMT
