Re: FYI: Oblivious HTTP

On Fri, Jan 29, 2021 at 5:11 PM David Benjamin <davidben@chromium.org>
wrote:

> On Thu, Jan 28, 2021 at 7:23 PM Martin Thomson <mt@lowentropy.net> wrote:
>
>> I think that it might be best to respond with a little more context on
>> what I believe the potential application of oblivious HTTP would be.  The
>> draft doesn't really go into that, because that isn't really a good place
>> to capture these sorts of things.
>>
>> Just to set this aside, I don't see this as building toward replicating
>> something like Tor.  There are obvious parallels, but the two approaches
>> have very different assumptions about trust and the incentives of various
>> actors.  Tor, as a system, is also far more complex and ambitious.  So, by
>> all means look for parallels in Tor, but understand that this has very
>> different models for both the threats it considers and the environment it
>> might be deployed in.
>>
>> The other thing that I think is important to understand is that - at
>> least from my perspective - the goal is not to carry any significant
>> proportion of HTTP requests.  For instance, I don't see this being used for
>> web browsing.  If we're willing to echo cookies, then you can safely assume
>> that we won't be making those requests oblivious.  And many other cases
>> benefit from being able to establish some sort of continuity, if only to
>> deal with denial of service risks.  Each application would have to be
>> assessed on its own.
>>
>> The things that we're talking about using this for are those cases where we
>> have identified a privacy risk associated with a server being able
>> to link requests.  The original case in research was DNS queries, where it
>> has been shown that building profiles of users based on their DNS activity
>> has poor privacy properties.  At Mozilla, we're also considering this style
>> of approach in other places where browsers make requests with information
>> that might be sensitive, like telemetry reporting.
>>
>> There are non-trivial costs associated with setting this up.  As a proxy
>> needs to be run by a separate entity that doesn't see any direct benefit
>> from the service it provides, you have to arrange for its costs to be
>> met somehow.  You need to do so in a way that the server can ensure that
>> the proxy is not enabling DoS attacks, while also retaining sufficient
>> independence that clients can trust the proxy.  This is harder as the use
>> cases become more general, but we believe that this can be arranged for
>> certain specific cases.
>>
>> Does the explanation about applicability help?  I realize now that I
>> shouldn't have left this up to inference, and the draft should probably at
>> least address the point directly, so I'll make sure that the next version
>> does something about that.
>>
>
> I agree the draft should talk about this. I initially read it as intending
> a general replacement for proxying strategies, which seemed odd. As you
> note, in more stateful contexts like web browsing, the correlation
> boundaries are so much larger than a request that this probably doesn't buy
> much over simply partitioning connection pools. Whereas I could see
> applications with more standalone requests wanting different tradeoffs.
>
> One comment on the privacy properties here: the requests are only as
> uncorrelated as the key configurations used by the client.
>

Yes. This is like PrivacyPass in that respect.


> In the online HTTPS fetch model described here, if each client
> independently fetches, the server could maintain many configs and serve a
> different one each time. I assume that the client will cache the key
> configuration (otherwise it needs a new fetch per request, at which point
> it may as well just tunnel HTTPS), which means clients can be identified,
> or partially identified, by their keys. This is helped a bit by the small
> keyID size: the server gets 8 bits plus a couple more from algorithm
> selections, and then the rest must come from trial decryption. But how well
> this mitigates it depends on volume and whether this could be combined with
> information in the requests themselves. (E.g. telemetry reports may reveal
> platform information or rough performance characteristics.) That's probably
> worth some text, both on the privacy implications of the suggested
> configuration model, and on privacy implications of configuration models in
> general.
>
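
To make that arithmetic concrete, here's a back-of-the-envelope sketch in
Python. The 8-bit key ID is from the draft; the counts for algorithm
combinations and for bits leaked by request contents are illustrative
assumptions, not anything the draft specifies.

import math

def anonymity_set(population, identifying_bits):
    """Expected clients sharing one fingerprint, assuming the values are
    spread roughly evenly over the population."""
    return population / (2 ** identifying_bits)

key_id_bits = 8                # the one-byte key ID the server sees directly
algorithm_bits = math.log2(4)  # assumed: ~4 plausible KDF/AEAD combinations
request_bits = 5               # assumed: platform/perf hints in the request itself

print(f"bits before trial decryption: ~{key_id_bits + algorithm_bits:.0f}")
total = key_id_bits + algorithm_bits + request_bits
print(f"bits with request contents:   ~{total:.0f}")
print(f"clients per bucket, out of 1,000,000: ~{anonymity_set(1_000_000, total):.0f}")

The exact numbers don't matter; the point is that the key-derived bits
compound with whatever the request contents already reveal.
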
> Alternatively, the proxy could cache and serve the key configuration. Then
> everyone behind the same proxy uses the same config. Though you'd then need
> some kind of offline authentication on the config, such as SXGs.
>
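
A rough sketch of what that proxy-side caching could look like, purely
illustrative since the draft doesn't define how a proxy would distribute or
authenticate configurations; the interface names here are hypothetical.

import time

class ConfigCachingProxy:
    """Hands every client behind this proxy the same cached copy of the
    target's key configuration, so the config itself can't partition them."""

    def __init__(self, fetch_from_target, max_age=3600):
        self._fetch = fetch_from_target  # callable returning the config bytes
        self._cached = None
        self._expires = 0.0
        self._max_age = max_age

    def key_config(self):
        now = time.time()
        if self._cached is None or now >= self._expires:
            # One upstream fetch, shared by everyone behind the proxy.
            self._cached = self._fetch()
            self._expires = now + self._max_age
        return self._cached

Clients would still need some offline way to check that the bytes actually
came from the target rather than the proxy, which is where something like
SXGs comes in.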

It's worth noting that in a number of these cases (e.g., Telemetry) you'll
want to preconfigure the key into the client, in which case this becomes
less of an issue.
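
As a minimal sketch of that option, assuming a hypothetical client where the
configuration ships with the build (the value below is just a placeholder):

# Key configuration baked in at build time; every client in a given release
# carries the same bytes, so fetching it can't be used to tag individual
# clients. (Placeholder value, not a real configuration.)
PRECONFIGURED_KEY_CONFIG = bytes.fromhex("00010020")

def key_config() -> bytes:
    # No per-client online fetch: identifying information is bounded by the
    # number of distinct releases, not by anything the server hands out.
    return PRECONFIGURED_KEY_CONFIG

Key rotation then rides along with client updates rather than an online
fetch.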

-Ekr

Received on Sunday, 31 January 2021 00:43:18 UTC