Re: FYI: Oblivious HTTP

Thanks David, these are all good points to capture.  I've flagged you on the issues I've created in response to these, though it will take a little bit of work to get this all written down.

For those following along at home: https://github.com/unicorn-wg/oblivious-http/issues

On Sat, Jan 30, 2021, at 12:08, David Benjamin wrote:
> On Thu, Jan 28, 2021 at 7:23 PM Martin Thomson <mt@lowentropy.net> wrote:
> > I think that it might be best to respond with a little more context on what I believe the potential application of oblivious HTTP would be.  The draft doesn't really go into that, because the draft itself isn't a good place to capture these sorts of things.
> > 
> > Just to set this aside, I don't see this as building toward replicating something like Tor.  There are obvious parallels, but the two approaches have very different assumptions about trust and the incentives of various actors.  Tor, as a system, is also far more complex and ambitious.  So, by all means look for parallels in Tor, but understand that this has very different models for both the threats it considers and the environment it might be deployed in.
> > 
> > The other thing that I think is important to understand is that - at least from my perspective - the goal is not to carry any significant proportion of HTTP requests.  For instance, I don't see this being used for web browsing.  If we're willing to echo cookies, then you can safely assume that we won't be making those requests oblivious.  And many other cases benefit from being able to establish some sort of continuity, if only to deal with denial of service risks.  Each application would have to be assessed on its own.
> > 
> > The cases where we're talking about using this are those where we have identified a privacy risk in a server being able to link requests.  The original case in the research was DNS queries, where it has been shown that building profiles of users based on their DNS activity has poor privacy properties.  At Mozilla, we're also considering this style of approach in other places where browsers make requests with information that might be sensitive, like telemetry reporting.
> > 
> > There are non-trivial costs associated with setting this up.  Because the proxy needs to be run by a separate entity that doesn't see any direct benefit from the service it provides, you have to arrange for its costs to be met somehow.  You need to do so in a way that lets the server ensure that the proxy is not enabling DoS attacks, while the proxy retains enough independence that clients can trust it.  This gets harder as the use cases become more general, but we believe that it can be arranged for certain specific cases.
> > 
> > Does the explanation about applicability help?  I realize now that I shouldn't have left this up to inference, and the draft should probably at least address the point directly, so I'll make sure that the next version does something about that.
> 
> I agree the draft should talk about this. I initially read it as 
> intending a general replacement for proxying strategies, which seemed 
> odd. As you note, in more stateful contexts like web browsing, the 
> correlation boundaries are so much larger than a request that this 
> probably doesn't buy much over simply partitioning connection pools. 
> Whereas I could see applications with more standalone requests wanting 
> different tradeoffs.
> 
> One comment on the privacy properties here: the requests are only as 
> uncorrelated as the key configurations used by the client. In the 
> online HTTPS fetch model described here, if each client independently 
> fetches, the server could maintain many configs and serve a different 
> one each time. I assume that the client will cache the key 
> configuration (otherwise it needs a new fetch per request, at which 
> point it may as well just tunnel HTTPS), which means clients can be 
> identified, or at least partially identified, by their keys. This is helped a bit 
> by the small keyID size: the server gets 8 bits plus a couple more from 
> algorithm selections, and then the rest must come from trial 
> decryption. But how well this mitigates it depends on volume and 
> whether this could be combined with information in the requests 
> themselves. (E.g. telemetry reports may reveal platform information or 
> rough performance characteristics.) That's probably worth some text, 
> both on the privacy implications of the suggested configuration model, 
> and on privacy implications of configuration models in general.
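> 
> To make that concrete, here's a rough sketch of the arithmetic, with 
> made-up numbers (the one-octet key ID comes from the draft; everything 
> else here is an assumption):
> 
>     import math
> 
>     # Hypothetical server strategy: maintain many distinct key
>     # configurations and hand a different one to each client.
>     n_configs = 4096    # configs the server maintains (assumed)
>     key_id_space = 256  # one-octet key ID in the encapsulated request
>     alg_combos = 4      # KDF/AEAD pairs the server offers (assumed)
> 
>     # Buckets distinguishable directly from the wire format:
>     wire_buckets = min(n_configs, key_id_space * alg_combos)
> 
>     # Configs that collide on (key ID, algorithms) have to be
>     # separated by trial decryption against each candidate key:
>     trials = math.ceil(n_configs / (key_id_space * alg_combos))
> 
>     print(f"{math.log2(n_configs):.0f} bits of client identity in total:")
>     print(f"{wire_buckets} buckets read off the wire, "
>           f"up to {trials} trial decryptions for the rest")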
> 
> Alternatively, the proxy could cache and serve the key configuration. 
> Then everyone behind the same proxy uses the same config. Though you'd 
> then need some kind of offline authentication on the config, such as 
> SXGs.
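> 
> To be clear about what offline authentication buys: the config would 
> carry a signature the client can check without contacting the server, 
> so the caching proxy can't substitute its own config. SXG is one real 
> format for that; as a bare-bones illustration (placeholder bytes, not 
> SXG itself):
> 
>     from cryptography.hazmat.primitives.asymmetric.ed25519 import (
>         Ed25519PrivateKey,
>     )
> 
>     server_key = Ed25519PrivateKey.generate()  # held offline by the server
>     config_bytes = b"serialized key configuration"  # placeholder
>     signature = server_key.sign(config_bytes)
> 
>     # Client side, with the server's public key obtained out of band;
>     # verify() raises InvalidSignature if the proxy tampered with it.
>     server_key.public_key().verify(signature, config_bytes)
> 
> The point being that trust in the config no longer depends on who 
> served it.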
> 
> I think the draft should also discuss the security implications a bit 
> more, since it's replacing the guarantees you'd otherwise get from 
> proxying HTTPS. Some things that came to mind:
> 
> - Related to all the online vs offline signature excitement, the client 
> probably needs to enforce a time bound on key configurations, in order 
> to contain the effects of a temporary compromise. (A sketch of such a 
> bound follows this list.)
> 
> - Like TLS 1.3 0-RTT, encapsulated requests don't have replay 
> protections and lack forward secrecy. This shouldn't be used where 
> replay protection matters, and key configurations should be rotated 
> frequently.
> 
> - Unlike TLS 1.3 0-RTT, encapsulated responses also lack forward 
> secrecy. (If it's an issue, I suppose you could fix this by doing a 
> second HPKE in the other direction and binding the two together; a toy 
> sketch of that follows the list too.)
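> 
> A minimal sketch of what that client-side time bound might look like; 
> the field names and the 24-hour limit are assumptions, not from the 
> draft:
> 
>     import time
>     from dataclasses import dataclass
> 
>     @dataclass
>     class KeyConfig:
>         key_id: int
>         public_key: bytes
>         fetched_at: float  # when this client obtained the config
> 
>     MAX_CONFIG_AGE = 24 * 3600  # seconds; assumed lifetime bound
> 
>     def usable(config: KeyConfig) -> bool:
>         # Refusing old configs both narrows the replay and
>         # forward-secrecy window and caps how long a compromised
>         # key remains useful to an attacker.
>         return time.time() - config.fetched_at < MAX_CONFIG_AGE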
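> 
> And a toy sketch of that last fix, emphatically not the draft's 
> mechanism, with a raw X25519 exchange standing in for the second HPKE: 
> the client sends a fresh ephemeral key alongside the encapsulated 
> request, the server answers with its own ephemeral share, and the 
> response key is bound to the request by mixing the request's enc into 
> the KDF:
> 
>     import os
>     from cryptography.hazmat.primitives import hashes
>     from cryptography.hazmat.primitives.asymmetric.x25519 import (
>         X25519PrivateKey,
>     )
>     from cryptography.hazmat.primitives.ciphers.aead import AESGCM
>     from cryptography.hazmat.primitives.kdf.hkdf import HKDF
> 
>     client_eph = X25519PrivateKey.generate()  # fresh per request
> 
>     # Server side: its own ephemeral share, never reused.
>     server_eph = X25519PrivateKey.generate()
>     shared = server_eph.exchange(client_eph.public_key())
> 
>     request_enc = b"HPKE enc from the request"  # placeholder
>     key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
>                info=b"response-fs" + request_enc).derive(shared)
> 
>     nonce = os.urandom(12)
>     sealed = AESGCM(key).encrypt(nonce, b"response body", request_enc)
> 
>     # The client repeats the DH with server_eph.public_key() to
>     # decrypt; deleting both ephemeral private keys afterwards is
>     # what buys the forward secrecy.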
> 
> David

Received on Monday, 1 February 2021 00:29:41 UTC