- From: Ian Hickson <ian@hixie.ch>
- Date: Sun, 21 Oct 2007 07:51:36 +0000 (UTC)
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- Cc: public-appformats@w3.org
On Sun, 21 Oct 2007, Bjoern Hoehrmann wrote:
> >
> > Referer-Root gets sent with every request; how would it be used to
> > distinguish the two?
>
> I didn't think you would. I take it that would again be motivated by the
> list-too-long case, but sending the header on every request very poorly
> accounts for that. The content of the header is highly sensitive, if you
> make a couple of popular web services that anyone can embed, you'd get a
> steady stream of information about what web sites people visit

When two sites cooperate, they can already share such information.


> including intranet sites that make use of some of the services.

Clearly an intranet site's development team should take care not to leak
information that they desire to keep secret -- just like today, where any
link to an external site, image, script, etc, can leak information.


> That you don't get to know exactly what pages they visit makes little
> difference. So sending it out so gratuitously would quickly lead people
> to filter it out.

Oh, that some people would end up filtering this out was never in any
doubt, just like some people disable scripting.


> Further, the whole mechanism would be intransparent: you have no way of
> saying a particular access has been "granted" or "denied" on a case by
> case basis, which is especially important in the access check request
> case.

The worst you can get is a false negative (denial when it should be
allowed), and merely adding a Vary: header with the appropriate values
will remove that problem. Certainly it would be helpful for the spec to
advise here, but it's hardly a blocking issue.


> In Anne's cache the Referer-Root is not actually stored there, leading
> to false positives if you do not prevent caching of the result, or to
> false negatives if you do add it.

Then we should store the Referer-Root as part of the cache key, as has
already been suggested.


> Preventing caching is not really possible either, at best you could try
> to set a bogus expiration date to circumvent your
> keep-it-cached-for-a-little-bit policy.

If such a caching policy is desired, it seems reasonable to do that. We
haven't heard from any site developers suggesting they need this for a
real site.


> The draft does not actually mention case-by-case policies as an option,
> so it would seem we have an overly intrusive, wasteful, awkward,
> under-documented mechanism that's supposed to simplify some minority of
> cases.

Actually it's quite an important case for large developers.


> Those cases seem way too few to simplify them and I have doubts they're
> really being simplified by Referer-Root; but I am happy to read up on a
> detailed analysis of the problem meant to be solved here, if anyone can
> point me to it. That'd also help to come up with alternate solutions if
> they are actually needed.

I don't believe any detailed analyses have ever been published. The case
is simply a third-party service site with many customer (second-party)
sites that wish to use the service, but where the services are only
available on a per-license basis. For each service request, the site
needs to know the requesting domain (information which a hostile
fourth-party site can't fake) to determine whether or not the request is
from the page of a valid second-party site. (Billing can also be managed
in this way.)
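(For concreteness, a rough, untested sketch of the kind of check such a
third-party service might do in a CGI script. The domain names, the
licence list, and the assumption that Referer-Root carries a
scheme-plus-host value are all invented for illustration; only the
Referer-Root header and the Vary: handling come from the discussion
above.)

  #!/usr/bin/env python
  # Sketch only: gate a licensed third-party service on the proposed
  # Referer-Root header. Licence list and domains are made up.
  import os
  import sys

  LICENSED_ROOTS = set([
      "http://customer-one.example.com",   # hypothetical second-party sites
      "http://customer-two.example.org",
  ])

  def main():
      # CGI exposes the Referer-Root request header as HTTP_REFERER_ROOT.
      root = os.environ.get("HTTP_REFERER_ROOT")
      if root in LICENSED_ROOTS:
          sys.stdout.write("Content-Type: text/plain\r\n")
          sys.stdout.write("Vary: Referer-Root\r\n\r\n")  # cache per requesting root
          sys.stdout.write("service response for %s\n" % root)
      else:
          sys.stdout.write("Status: 403 Forbidden\r\n")
          sys.stdout.write("Content-Type: text/plain\r\n")
          sys.stdout.write("Vary: Referer-Root\r\n\r\n")
          sys.stdout.write("not a licensed site\n")

  if __name__ == "__main__":
      main()

(The Vary: Referer-Root line is the point made about false negatives
above: it tells caches to key the stored response on that header, which
amounts to storing the Referer-Root as part of the cache key.)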
> > I thought the proposal was to have a separate cache (non-HTTP) for the
> > pre-flight test requests.
>
> Yes, and the response to that request, as well as ordinary requests, may
> be served from a cache, which may very well serve the wrong variant if
> you don't properly indicate that there are multiple variants and prevent
> caching of the variant as necessary. Authors usually implement this
> incorrectly, especially if they are trying to minimize the number of
> times the document is fetched as they would in your expensive-to-compute
> case (or the more general dont-want-to-waste-traffic case due to using
> GET.)

That kind of caching would be fine, actually -- the whole point here
isn't that we are worried about the response being cached, it's that
we're worried about the response not being cached _enough_. Your concern
boils down to our original proposal, wherein we were happy with the
over-caching risks, and our only issue was that HTTP rules would cause
the cache to flush too easily (namely, immediately after the request each
time).


> > Actually the "Allow" line above is directly copied and pasted from the
> > response headers sent in response to an OPTIONS to this URI:
> >
> >   http://software.hixie.ch/utilities/cgi/test-tools/echo
> >
> > ...which is a CGI script.
>
> Very old versions of the Apache server don't let mod_cgi scripts handle
> OPTIONS requests without additional configuration. That was changed two
> years ago, and you have a range of other options to satisfy the request,
> such as using mod_perl or mod_php, mod_headers, mod_rewrite, etc.

Very old my installation might be, but it's the standard installation on
one of the Web's biggest shared hosting providers, so it's an
installation we presumably should consider.
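(For what it's worth, the kind of "additional configuration" being
alluded to might look something like the following untested sketch, which
uses mod_headers to advertise the methods the script supports rather than
letting the script answer OPTIONS itself; the filename is taken from the
URI above, but whether a shared host permits this at all depends on its
AllowOverride settings:)

  # .htaccess sketch, assuming mod_headers is available
  <Files "echo">
    Header set Allow "GET, POST, OPTIONS"
  </Files>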
> > The XML PIs are an important convenience because they work in all
> > other cases for this API and consistency here is important
> > (consistency in security APIs is critical -- anything surprising in
> > security APIs will almost always lead to security holes).
>
> Access check requests are completely different from all other cases; the
> purpose, from my perspective, of the processing instruction is to hinder
> hostile applications from reading confidential data from a document. Now
> with access check requests you never make the document you get available
> to hostile applications, so reading it makes little sense to me.
>
> Consistency and avoiding surprises is certainly important. If you want
> to find out what access options are available for a resource, you use
> the OPTIONS method to find out, certainly you would in the same origin
> case. Using GET instead is a surprise. The Allow header is also usually
> only used with OPTIONS, using it with GET instead is another surprise.
> You actually have Method-Check to work around adverse effects of these
> surprises.
>
> During the access check request, you are also only interested in what
> the access options for one particular resource are. Above you indicate
> you would follow redirects; doing that can only tell you something about
> the options available for other resources, so this is yet another
> surprise. Of course you probably follow redirects in "all other cases",
> so that might explain that, though I don't quite see how it'd work (may
> I post to X? Please go to Y instead. May I post to Y? Yes. Then post to
> X!?)

I understand that you think these "surprises" are more important or
surprising than the ones I raised concern about, but I disagree with that
position.


> > Sending the information in the "ultimate request" seems like it would
> > make it somewhat difficult for the original request to have the
> > information, which is required, as I explained in my last e-mail.
>
> The information is redundant at that point. The purpose of the access
> check request is to find out what the server is prepared to handle. If
> it answers accurately, there cannot be a problem afterwards; in the
> worst case it would tell the client to proceed and reject the following
> request. Generally, it will have to do that anyway.

We wish to have the optimisation that the access check is done before the
POST submission (as well as after, to control read access).
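(To illustrate the flow being argued over: the header names below are the
ones used in this thread, but the hosts and the exact request/response
syntax are invented for illustration and not checked against the draft.
The client first asks whether a cross-site POST to the resource would be
acceptable, and only submits it if the check succeeds.)

  Access check request, before the actual submission:

    GET /service/endpoint HTTP/1.1
    Host: third-party.example
    Referer-Root: http://customer.example
    Method-Check: POST

    HTTP/1.1 200 OK
    Allow: GET, POST, OPTIONS
    Vary: Referer-Root

  ...and only if that succeeds:

    POST /service/endpoint HTTP/1.1
    Host: third-party.example
    Referer-Root: http://customer.example
    Content-Type: application/x-www-form-urlencoded

    ...form data...

(A false negative at the first step just means the POST is never sent; a
server that answers the check inaccurately can still reject the POST
itself.)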
-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \  _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Sunday, 21 October 2007 07:51:54 UTC