- From: Marshall Eubanks <marshall.eubanks@gmail.com>
- Date: Mon, 9 Apr 2012 11:35:59 -0400
- To: Eric Rescorla <ekr@rtfm.com>
- Cc: rtcweb@ietf.org, public-webrtc@w3.org
I really like this analysis. Some questions.

2012/4/9 Eric Rescorla <ekr@rtfm.com>:
> Hi folks,
>
> Since it seems like we're going to be having a large number of
> interims, I thought it might be instructive to try to analyze a bunch
> of different locations to figure out the best strategy. My first cut
> analysis is below.
>
> Note that I'm not trying to make any claims about what the best set of
> venues is. It's obviously easy to figure out any statistic we want
> about each proposed venue, but how you map that data to "best" is up
> to you. In particular, there's some tradeoff between minimal total
> travel time and a "fair" distribution of travel times (not that I
> claim to know what that means).
>
>
> METHODOLOGY
> The data below is derived by treating both people and venues as
> airport locations and using travel time as our primary instrument.
>
> 1. For each responder for the current Doodle poll, assign a home
>    airport based on their draft publication history. We're missing a
>    few people but basically it should be pretty complete. Since
>    these people responded before the venue is known, it's at
>    least somewhat unbiased.
>
> 2. Compute the shortest advertised flight between each home airport
>    and the locations for each venue by looking at the shortest
>    advertised Kayak flights around one of the proposed interim
>    dates (6/10 - 6/13), ignoring price, but excluding "Hacker fares".
>    [Thanks to Martin Thomson for helping me gather these.]
>

1.) Why are some fields doubled? I.e.,

ARN SFO 14 13

Are these counted twice? That would, of course, give more weight to
those records.

2.) At any rate, I couldn't quite match your numbers. For SFO, for
example, I got

# SFO Records 29 | Mean 12.52 | RMS 15.34 | Std Dev 8.55 | Minimum 1.00 | Maximum 34.00 |

This assumes that each doubled entry counts as 2 separate entries.

If the second entries are ignored, I get

# SFO Records 21 | Mean 14.05 | RMS 17.05 | Std Dev 9.14 | Minimum 1.00 | Maximum 34.00 |

If two entries are averaged together (when present)

# SFO Records 21 | Mean 13.93 | RMS 16.97 | Std Dev 9.18 | Minimum 1.00 | Maximum 34.00 |

None of these 3 options match your

Venue            Mean     Median     SD
----------------------------------------------
SFO              13.5     11         12.2

In particular, your SD value seems high. (Note, I use the SD = root
mean square /(n-1) not / n convention, but that won't explain the
difference.)

Regards
Marshall

> This lets us compute statistics for any venue and/or combination
> of venues, based on the candidate attendee list.
>
> The three proposed venues:
>
> - San Francisco (SFO)
> - Boston (BOS)
> - Stockholm (ARN)
>
> Three hubs not too distant from the proposed venues:
>
> - London (LHR)
> - Frankfurt (FRA)
> - New York (NYC) [0]
>
> Also, Calgary (YYC), since the other two chair locations (BOS and SFO)
> were already proposed as venues, and I didn't want Cullen to feel
> left out.
>
>
> RESULTS
> Here are the results for each of the above venues, measured in total
> hours of travel (i.e., round trip).
>
> Venue            Mean     Median     SD
> ----------------------------------------------
> SFO              13.5     11         12.2
> BOS              12.3     11          7.5
> ARN              17.0     21         10.7
> FRA              14.8     17          7.3
> LHR              13.3     14          7.5
> NYC              11.5     11          5.8
> YYC              14.9     13         10.2
> SFO/BOS/ARN      14.3     13          3.6
> SFO/NYC/LHR      12.7     11.3        3.7
>
> XXX/YYY/ZZZ is a three-way rotation of XXX, YYY, and ZZZ. Obviously, mean
> and median are intended to be some sort of aggregate measure of travel
> time. I don't have any way to measure "fairness", but SD is intended
> as some metric of the variation in travel time between attendees.
>
> The raw data and software are attached.
> The files are:
>
> home-airports  -- the list of people's home airports
> durations.txt  -- the list of airport-airport durations
> doodle.txt     -- the attendees list
> pairings.py    -- the software to compute travel times
> doodle-out.txt -- the computed travel times for each attendee
>
> Obviously, there could be an error in the raw data or the software.
> Please feel free to send corrections, especially if you find
> something material.
>
>
> OBSERVATIONS
> Obviously, it's hard to know what the optimal solution is without
> some model for optimality, but we can still make some observations
> based on this data:
>
> 1. If we're just concerned with minimizing total travel time, then we
> would always meet in New York, since it has both the shortest mean travel
> time and the shortest median travel time, but as I said above, this
> arguably isn't fair to people who live either in Europe or California,
> since they always have to travel.
>
> 2. Combining West Coast, East Coast, and European venues has
> comparable (or at least not too much worse) mean/median values than
> NYC with much lower SDs. So, arguably that kind of mix is more fair.
>
> 3. There's a pretty substantial difference between hub and non-hub
> venues. In particular, LHR has a median travel time 7 hours less than
> ARN, and the SFO/NYC/LHR combination has a median/mean travel time
> about 2 hours less than SFO/BOS/ARN (primarily accounted for by the
> LHR/ARN difference). [Full disclosure, I've favored Star Alliance hubs
> here, but you'd probably get similar results if, for instance, you
> used AMS instead of LHR.]
>
>
> Obviously, your mileage may vary based on your location and feelings
> about what's fair, but based on this data, it looks to me like a
> three-way rotation between West Coast, East Coast, and European hubs
> offers a good compromise between minimum cost and a flat distribution
> of travel times.
>
> Personally, whatever we decide to do I'd ask that the WG settle now on
> a pattern going forward so that we can predictably budget our travel
> time and dollars.
>
>
> [0] Treating all three NYC airports as a single location.
>
> _______________________________________________
> rtcweb mailing list
> rtcweb@ietf.org
> https://www.ietf.org/mailman/listinfo/rtcweb
>
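
For anyone who wants to reproduce the comparison, the three ways of treating doubled entries discussed above (count both, keep only the first, average the pair) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the record layout, the function names `flatten` and `summarize`, and every sample row except "ARN SFO 14 13" are made up here and are not taken from durations.txt or pairings.py.

```python
import math
import statistics

# Hypothetical records: (home airport, venue, one or two durations in hours).
# Only the "ARN SFO 14 13" row appears in the thread; the others are invented.
RECORDS = [
    ("ARN", "SFO", [14, 13]),
    ("BOS", "SFO", [6]),
    ("LHR", "SFO", [11, 12]),
    ("SFO", "SFO", [1]),
]


def flatten(records, mode):
    """Reduce records to a flat list of durations.

    mode "both":    every listed duration counts as its own record
    mode "first":   only the first duration of each record is used
    mode "average": doubled durations are averaged into one value
    """
    values = []
    for _home, _venue, durations in records:
        if mode == "both":
            values.extend(durations)
        elif mode == "first":
            values.append(durations[0])
        elif mode == "average":
            values.append(sum(durations) / len(durations))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return values


def summarize(values):
    """Return the same kind of summary line quoted in the message."""
    n = len(values)
    return {
        "records": n,
        "mean": statistics.mean(values),
        "rms": math.sqrt(sum(v * v for v in values) / n),
        "sd_sample": statistics.stdev(values),       # divides by n - 1
        "sd_population": statistics.pstdev(values),  # divides by n
        "min": min(values),
        "max": max(values),
    }


for mode in ("both", "first", "average"):
    print(mode, summarize(flatten(RECORDS, mode)))
```

One design note: reporting both SD conventions side by side makes it easy to see how much of a discrepancy the n versus n-1 choice can account for. The identity RMS^2 = mean^2 + (population SD)^2 also gives a quick consistency check on any set of reported figures.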
Received on Tuesday, 10 April 2012 13:00:28 UTC