- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Wed, 11 Jun 2008 05:24:59 +0200
- To: "Anne van Kesteren" <annevk@opera.com>
- Cc: "WAF WG (public)" <public-appformats@w3.org>, public-webapps@w3.org
* Anne van Kesteren wrote:
>The feature avoids the overhead you get when you need to issue 10 POST
>requests to 10 distinct URIs on the same server scoped to some path.
>Without Access-Control-Policy-Path that requires 20 requests. With
>Access-Control-Policy-Path it requires 12. So for the N requests you want
>to make it roughly saves you of N additional requests for larger values of
>N.

Consider you have a simple echo protocol, a client sends a line to the
server and the server returns the same line to the client; then they
repeat until m bytes have been transferred. What does it matter how many
lines these m bytes have been split into? It does not, and it does not
matter much with HTTP either if you have persistent connections; it's
just a bit more complicated to find the end of a message.

So let's look at the traffic instead. The minimal OPTIONS request and
response look somewhat like this:

  +---------------------------+      +---------------------------+
  | OPTIONS /example HTTP/1.1 |      | HTTP/1.1 200              |
  | Host:example.org          | ===> | Access-Control:allow <*>  |
  | Origin:example.net        | <=== | Content-Length:0          |
  |                           |      |                           |
  +---------------------------+      +---------------------------+
                         --- 128 byte ---

So let's assume the average OPTIONS exchange generates three times as
much (you may want to include a Date or User-Agent header, for example).
The POST requests and responses are quite a bit longer, you would send
various Accept headers, Content-Type, Set-Cookie, whatever, so let's
assume an average of 7*128 bytes protocol overhead per transaction:

  [Plot: share of total traffic (0% to 80%) taken by the OPTIONS
  exchange, the protocol overhead, and the message bodies, against the
  average size of message bodies in bytes (request plus response), from
  0 to 5000. At the left edge the OPTIONS exchange starts near 30% and
  the protocol overhead near 70%; the message bodies approach 80% at
  the right edge.]

If your average request and response body are together less than about
2000 bytes, it would be silly to use many POSTs to different resources
and care a lot about optimizing the OPTIONS overhead away; you would
gain much more by using fewer POSTs, which would then eliminate OPTIONS
overhead as well. And if the average size exceeds about 3000 bytes, you
would gain only very little; almost all resources are spent processing
the POST requests.

(Calculating how much you might save in terms of "load" on the server is
more difficult than this simple model, because you have to consider how
the server handles concurrent and subsequent requests and consider the
cost of creating e.g. new threads and new processes; that's very
specific to the web server software, operating system, their
configuration, and even the hardware they are running on, and you might
pick slightly different numbers than I have; but the conclusion is
similar in all reasonable cases).

Note that you can always avoid the overhead of cross site posts using
cross document messaging and same origin posts, the target just needs to
install a suitable web page that accepts and dispatches the requests.
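
The traffic model above can be sketched in a few lines of Python. This
is only an illustration under the message's own assumptions: 128 bytes
for a minimal OPTIONS exchange, three times that for an average one,
and 7*128 bytes of protocol overhead per POST transaction. The function
name traffic_shares is hypothetical, and the sketch further assumes
exactly one OPTIONS exchange per POST.

```python
# Byte counts below are the message's own assumptions, not measurements.
MINIMAL_OPTIONS_EXCHANGE = 128                    # minimal OPTIONS request plus response
OPTIONS_EXCHANGE = 3 * MINIMAL_OPTIONS_EXCHANGE   # assumed average OPTIONS exchange
POST_OVERHEAD = 7 * MINIMAL_OPTIONS_EXCHANGE      # assumed header overhead per POST


def traffic_shares(body_bytes):
    """Return each component's fraction of the traffic for one preflighted POST.

    Assumes (hypothetically) exactly one OPTIONS exchange per POST.
    body_bytes is the size of the request body plus the response body.
    """
    total = OPTIONS_EXCHANGE + POST_OVERHEAD + body_bytes
    return {
        "options": OPTIONS_EXCHANGE / total,
        "overhead": POST_OVERHEAD / total,
        "bodies": body_bytes / total,
    }


if __name__ == "__main__":
    # Walk the same range of body sizes as the plot's horizontal axis.
    for body in range(0, 5001, 1000):
        s = traffic_shares(body)
        print(
            f"{body:>5} bytes: OPTIONS {s['options']:.0%}, "
            f"overhead {s['overhead']:.0%}, bodies {s['bodies']:.0%}"
        )
```

With these numbers the OPTIONS exchange accounts for 30% of the traffic
when the bodies are empty and for about 12% once the bodies reach 2000
bytes, which is why removing it buys little for larger messages.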
This comes at increased risk of course, but also with increased
flexibility and very likely better performance compared to cross site
requests, whether utilizing Access-Control-Policy-Path or not.

Less is always less of course, but let's look at how many requests per
page load are considered normal among the homepages of the Alexa 100
sites (www-archive has the raw data including methodology):

  [Histogram: number of sites in each range (0 to 16) against the
  number of requests when loading the front page (0 to 200).]

That is an average of about fifty requests and a median of about forty.
For the nba.com website you have to wait for over 200 requests to fully
load the page, and this isn't in response to any user-initiated data
submission. I am afraid I do not see much need for this misdesigned
optimization. Certainly not in the first version of this specification.

>Ian was one of the persons who proposed this feature and he doesn't think
>it's worthwhile to have it if it's scoped to the entire triple (just
>allowing the / value for instance).

I believe Ian initially didn't think "preflight" requests were necessary
for POST requests to begin with, then he thought the additional requests
weren't much of an issue and require no optimization, then they became
"somewhat painful", and I think the latest is "high cost". It seems wise
then to use some reasoning instead of relying on his opinion.

If none of your scripts on the host is secured against malicious cross
site requests, you should not be using the specification's features at
all.
If all your scripts are secured (I am /assuming/ that is possible),
then there would be no problem scoping it on the whole host (you'd have
to worry about denial of service perhaps, but you have that either way).

So Ian would seem to be saying this feature is only really useful if you
mix secured and unsecured scripts on the same host. Now clearly you have
your request savings whether or not you've secured your other scripts,
so either those savings are not what makes the feature worthwhile, or it
is not actually possible to secure those other scripts (and you cannot
simply put the cross site scripts on their own host).

Now I don't know which it is, perhaps the header is really only meant as
a placebo for people irrationally afraid of seeing many OPTIONS requests
in their server logs, or perhaps it is expected that the 'Origin' header
will be filtered out frequently, in which case you probably cannot tell
same-origin and cross-site requests apart. There are many possibilities,
but right now Ian's stance, as you relay it anyway, seems rather silly.
--
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Received on Wednesday, 11 June 2008 03:25:38 UTC