Re: FYI: Oblivious HTTP

On Thu, Jan 28, 2021 at 7:23 PM Martin Thomson <mt@lowentropy.net> wrote:

> I think that it might be best to respond with a little more context on
> what I believe the potential application of oblivious HTTP would be.  The
> draft doesn't really go into that, because that isn't really a good place
> to capture these sorts of things.
>
> Just to set this aside, I don't see this as building toward replicating
> something like Tor.  There are obvious parallels, but the two approaches
> have very different assumptions about trust and the incentives of various
> actors.  Tor, as a system, is also far more complex and ambitious.  So, by
> all means look for parallels in Tor, but understand that this has very
> different models for both the threats it considers and the environment it
> might be deployed in.
>
> The other thing that I think is important to understand is that - at least
> from my perspective - the goal is not to carry any significant proportion
> of HTTP requests.  For instance, I don't see this being used for web
> browsing.  If we're willing to echo cookies, then you can safely assume
> that we won't be making those requests oblivious.  And many other cases
> benefit from being able to establish some sort of continuity, if only to
> deal with denial of service risks.  Each application would have to be
> assessed on its own.
>
> The things that we're talking about using this for are those cases where
> we have identified a privacy risk associated with a server being able to
> link requests.  The original case in research was DNS queries, where it
> has been shown that building profiles of users based on their DNS activity
> has poor privacy properties.  At Mozilla, we're also considering this style
> of approach in other places that browsers make requests with information
> that might be sensitive, like telemetry reporting.
>
> There are non-trivial costs associated with setting this up.  As the proxy
> needs to be run by a separate entity that doesn't see any direct benefit
> from the service it provides, you have to arrange for its costs to be
> met somehow.  You need to do so in a way that the server can ensure that
> the proxy is not enabling DoS attacks, while also retaining sufficient
> independence that clients can trust the proxy.  This is harder as the use
> cases become more general, but we believe that this can be arranged for
> certain specific cases.
>
> Does the explanation about applicability help?  I realize now that I
> shouldn't have left this up to inference, and the draft should probably at
> least address the point directly, so I'll make sure that the next version
> does something about that.
>

I agree the draft should talk about this. I initially read it as intending
a general replacement for proxying strategies, which seemed odd. As you
note, in more stateful contexts like web browsing, the correlation
boundaries are so much larger than a single request that this probably
doesn't buy much over simply partitioning connection pools, whereas I
could see applications with more standalone requests wanting different
tradeoffs.

One comment on the privacy properties here: the requests are only as
uncorrelated as the key configurations used by the client. In the online
HTTPS fetch model described here, if each client fetches independently,
the server could maintain many configs and serve a different one each
time. I assume that the client will cache the key configuration
(otherwise it needs a new fetch per request, at which point it may as
well just tunnel HTTPS), which means clients can be identified, or at
least partially identified, by their keys. This is limited somewhat by
the small keyID size: the server only gets 8 bits, plus a couple more
from algorithm selections, and the rest must come from trial decryption.
But how much that helps depends on request volume and on whether the
key-based signal can be combined with information in the requests
themselves. (E.g. telemetry reports may reveal platform information or
rough performance characteristics.) That's probably worth some text,
both on the privacy implications of the suggested configuration model
and on the privacy implications of configuration models in general.
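
To make the arithmetic concrete, a back-of-the-envelope sketch; the
counts of KDF and AEAD options a config offers are assumptions for
illustration, not anything the draft pins down:

    import math

    # Rough estimate of the client-identifying signal visible in the
    # encapsulated request header alone, before any trial decryption.
    # The option counts below are assumptions for illustration.
    KEY_ID_BITS = 8   # one-octet key ID
    NUM_KDFS = 2      # assumed number of KDF options a config offers
    NUM_AEADS = 3     # assumed number of AEAD options a config offers

    # Each (key ID, KDF, AEAD) combination is a distinguishable bucket.
    buckets = (2 ** KEY_ID_BITS) * NUM_KDFS * NUM_AEADS
    print(buckets, "buckets,", round(math.log2(buckets), 1), "bits")
    # -> 1536 buckets, 10.6 bits

A thousand-odd observable buckets is already enough to thin out an
anonymity set considerably when the per-config request volume is low.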

Alternatively, the proxy could cache and serve the key configuration, so
that everyone behind the same proxy uses the same config. You'd then
need some kind of offline authentication on the config, such as SXGs
(Signed HTTP Exchanges).
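
As a sketch of what proxy-side caching might look like (the fetch and
offline-verification steps here are stand-in callables, not real APIs):

    import time

    class KeyConfigCache:
        """Hypothetical proxy-side cache: every client behind this
        proxy gets the same key configuration until it expires."""

        def __init__(self, fetch_config, verify_config, ttl=3600):
            self._fetch = fetch_config    # callable: () -> bytes
            self._verify = verify_config  # callable: bytes -> bool
            self._ttl = ttl               # assumed refresh interval
            self._config = None
            self._expires = 0.0

        def get(self):
            now = time.monotonic()
            if self._config is None or now >= self._expires:
                config = self._fetch()
                if not self._verify(config):  # e.g. an SXG check
                    raise ValueError("config failed offline auth")
                self._config = config
                self._expires = now + self._ttl
            return self._config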

I think the draft should also discuss the security implications a bit more,
since it's replacing the guarantees you'd otherwise get from proxying
HTTPS. Some things that came to mind:

- Related to all the online vs. offline signature excitement, the client
probably needs to enforce a time bound on key configurations, in order to
contain the effects of a temporary compromise. (A sketch of such a check
follows this list.)

- Like TLS 1.3 0-RTT, encapsulated requests don't have replay protection
and lack forward secrecy. This shouldn't be used where replay protection
matters, and key configurations should be rotated frequently. (A sketch
of a gateway-side replay filter also follows the list.)

- Unlike TLS 1.3 0-RTT, encapsulated responses also lack forward secrecy.
(If it's an issue, I suppose you could fix this by doing a second HPKE in
the other direction and binding the two together.)
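
On the first point, here's a rough sketch of the kind of client-side
time bound I have in mind. The field names and limits are illustrative
assumptions, not anything the draft specifies:

    import time

    # Hypothetical client-side policy: refuse to encrypt to a key
    # configuration past a hard age limit, and refresh well before
    # that so rotation stays frequent. The limits are assumptions.
    MAX_CONFIG_AGE = 7 * 24 * 3600   # hard bound: one week
    REFRESH_AFTER = 24 * 3600        # soft bound: refetch daily

    def config_usable(fetched_at, now=None):
        """Return (usable, should_refresh) for a cached config."""
        now = time.time() if now is None else now
        age = now - fetched_at
        return age < MAX_CONFIG_AGE, age >= REFRESH_AFTER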
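
And on the replay point, if a deployment did want coarse replay
detection, one could imagine the gateway remembering the KEM
encapsulation ("enc") of each request seen under the current key
configuration, in the spirit of the TLS 1.3 0-RTT anti-replay
mechanisms. This is purely a sketch of my own, not something in the
draft:

    # Hypothetical gateway-side replay filter, keyed on the KEM
    # encapsulation. An honest client generates a fresh "enc" per
    # request, so an exact duplicate indicates a replay. Memory is
    # bounded by clearing the filter at each key rotation.
    class ReplayFilter:
        def __init__(self):
            self._seen = set()

        def rotate(self):
            # Call whenever the key configuration is rotated.
            self._seen.clear()

        def accept(self, enc: bytes) -> bool:
            # True the first time an encapsulation is seen.
            if enc in self._seen:
                return False
            self._seen.add(enc)
            return True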

David
