Re: Workload for surrogates

From: hardie@equinix.com
Date: Mon, Aug 21 2000

    From: hardie@equinix.com
    Message-Id: <200008212047.NAA25681@nemo.corp.equinix.com>
    To: mnot@akamai.com (Mark Nottingham)
    Date: Mon, 21 Aug 2000 13:47:05 -0700 (PDT)
    Cc: douglis@research.att.com (Fred Douglis), hardie@equinix.com, surrogates@equinix.com, www-wca@w3.org
    Subject: Re: Workload for surrogates
    
    Mark writes:
    >  Characterizing CDNs is shooting at a moving target; each one is
    > going to take a different tack at how it handles objects, and what
    > gets routed through the CDN. So-called "edge processing" throws
    > another wrench into the works, as each one will have a different
    > approach.
    
    I agree, but my question isn't so much about characterizing CDNs,
    but characterizing the workload of CDNs.  For example, most CDNs
    are limited to cachable objects (certainly all surrogate system
    CDNs are limited to cachable objects).  Given that, what are
    hit rates against the CDNs' cache like?  Is a 40% hit rate typical,
    as it would be for a Squid cache in the wild, or is 90% more
    typical?  What are the distributions of things like number of
    hits per object, time in cache, etc.?
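
    To make the quantities asked about above concrete, here is a minimal
    sketch in Python. It is an illustration only, not drawn from any CDN's
    data: the log format (pairs of object id and hit/miss flag), the
    function name summarize, and the sample records are all hypothetical.

```python
from collections import Counter


def summarize(log):
    """Summarize a hypothetical access log of (object_id, was_hit) pairs.

    Returns the overall hit rate, the number of hits per object, and the
    distribution of hits per object (how many objects received k hits).
    """
    records = list(log)
    total = len(records)
    hits = sum(1 for _, was_hit in records if was_hit)
    hit_rate = hits / total if total else 0.0

    # Count cache hits for each object id.
    hits_per_object = Counter(obj for obj, was_hit in records if was_hit)

    # Map "number of hits" to "number of objects with that many hits".
    distribution = Counter(hits_per_object.values())
    return hit_rate, dict(hits_per_object), dict(distribution)


# Made-up sample: the first request for each object is a miss.
sample_log = [
    ("a.gif", False),
    ("a.gif", True),
    ("a.gif", True),
    ("b.html", False),
    ("b.html", True),
]

hit_rate, per_object, distribution = summarize(sample_log)
print(hit_rate, per_object, distribution)
```

    Time in cache could be summarized the same way if the log also carried
    fill and eviction timestamps, which this sketch does not assume.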
    
    I don't want anyone to give away their secret sauce on this stuff,
    such as describing the exact algorithms for cache filling, but
    some general sense of what the workload edges are like would
    be a big help.
    				regards,
    					Ted Hardie
    
    
    
    
    > 
    > Of course, all surrogates are not used in CDNs; "reverse proxies" are
    > somewhat widely deployed on popular sites (that's a feeling; I haven't seen
    > much data to support it).
    > 
    > I'd think two workloads would be in order; one to represent a surrogate in
    > front of an entire, "typical" site, and one to represent a CDN that handles
    > all cacheable objects. 
    > 
    > These would only give rough figures, of course, but that's about the best
    > that can be done IMHO, and they would still be useful to compare.
    > 
    > 
    > On Thu, Aug 17, 2000 at 09:33:41AM -0400, Fred Douglis wrote:
    > > [Cross-posted to wca list from surrogates list.  Original message attached.]
    > > 
    > > Ted,
    > > 
    > > Regarding CDNs, since in general there is a tendency toward emphasizing 
    > > cachable content up front, I would expect the workload to be somewhat 
    > > different from a typical origin server, at least one that has any significant 
    > > dynamic data.  There'll be a greater fraction of hits to more static content 
    > > such as gifs.  
    > > 
    > > I know there's been some work on benchmarking CDNs but I don't know if there's 
    > > been work on characterizing behavior.  The W3C workload characterization group 
    > > might be looking into this -- have you asked around there?  I'm cc-ing them 
    > > here.  Perhaps someone will have additional info.
    > > 
    > > Regards,
    > > 
    > > Fred
    > 
    > Content-Description: 4
    > > Date: Wed, 16 Aug 2000 14:00:08 -0700 (PDT)
    > > From: hardie@equinix.com
    > > Reply-To: hardie@equinix.com
    > > To: surrogates@equinix.com
    > > Cc: hardie@equinix.com (ted hardie)
    > > Subject: Workload for surrogates
    > > Delivery-Date: Wed Aug 16 17:15 EDT 200
    > > X-Mailer: ELM [version 2.5 PL3]
    > > 
    > > As part of the testing of the Bellwether surrogates implementation,
    > > Duane and I have been talking about what the workload for a
    > > demand-driven surrogate would look like.  We've been testing both a
    > > single surrogate and load-balanced surrogates using a Polygraph
    > > workload that presumes a fairly high hit rate.  This derives from our
    > > model of a surrogate that gets invoked by periods of high demand for a
    > > limited data set (the flash crowd/CNN event effect typical for a Starr
    > > report release).
    > > 
    > > Thinking about this, though, I've been wondering what other types of
    > > workloads might be common.  CDNs seem likely to have a workload
    > > close to an origin server, whereas some demand-driven surrogates might
    > > have a workload like a proxy cache.  
    > > 
    > > Any insight out there on what you expect or what you have seen in
    > > initial deployments?
    > > 			regards,
    > > 				Ted Hardie
    > > 				Equinix, Inc.
    > > 
    > 
    > 
    > -- 
    > Mark Nottingham, Research Scientist
    > Akamai Technologies (San Mateo, CA)
    >