Re: I-D Action: draft-pauly-httpbis-geoip-hint-01.txt from Ben Schwartz on 2024-10-24 (ietf-http-wg@w3.org from October to December 2024)

From: Ben Schwartz <bemasc@meta.com>
Date: Thu, 24 Oct 2024 22:06:58 +0000
To: David Schinazi <dschinazi.ietf@gmail.com>
CC: Ted Hardie <ted.ietf@gmail.com>, Stephen Farrell <stephen.farrell@cs.tcd.ie>, Watson Ladd <watsonbladd@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-ID: <SA1PR15MB43708BB89C1EEF961891B214B34E2@SA1PR15MB4370.namprd15.prod.outlook.com>
My point is that the distinction between browser-affiliated privacy proxies and third-party proxies here is artificial.  Both kinds of proxy operators commonly try to encode a coarse location for the client in the selected egress IP.  Both kinds of proxy operators are regularly disappointed by the IP->location results produced for their egress IPs in third-party databases.  Both kinds of proxy operators could easily operate a service that would inform the browser of the geolocation that web servers ought to see.

This draft proposes a solution that only helps the browser-affiliated proxies.  As I mentioned earlier, this limitation is also artificial, and could be removed by permitting the browser to use DHCP geolocation (RFC 4776) when a VPN is active, defining a .well-known attribute on the proxy domain, or many other ways.

> When both (1) and (2) are enabled at the same time, traffic from the client flows through (2) before it reaches (1), so the IP address used to compute the geo will still be the egress IP from (2).

This is an interesting example that highlights the problem: the IP address used to compute the geo hint will be the egress IP from (2), but the database used to perform that computation will be controlled by (1).  This draft does not provide any way for (2) to provide the "corrected" geoIP database tuned for its egress IPs, so only (1) benefits from "corrected" geoIP answers.

--Ben

________________________________
From: David Schinazi <dschinazi.ietf@gmail.com>
Sent: Thursday, October 24, 2024 4:54 PM
To: Ben Schwartz <bemasc@meta.com>
Cc: Ted Hardie <ted.ietf@gmail.com>; Stephen Farrell <stephen.farrell@cs.tcd.ie>; Watson Ladd <watsonbladd@gmail.com>; ietf-http-wg@w3.org <ietf-http-wg@w3.org>
Subject: Re: I-D Action: draft-pauly-httpbis-geoip-hint-01.txt

Hi everyone,

I'm realizing I've been using some terminology without defining it, leading to some confusion. Let's create a distinction between two distinct kinds of IP-hiding technologies.

1) Privacy proxies. Examples of these include Google's IP Protection and Apple's iCloud Private Relay. These are affiliated with a browser, and integrated pretty tightly with that browser (and/or operating system). The goal of these is to prevent websites from having access to the user's IP address, because that represents a stable tracking identifier. However, these privacy proxies do not try to hide the user's coarse location. They look at the client's IP address, map that to a city (for Google, for example, we map it to the closest grouping of 500,000 people), and then the privacy proxy picks an egress IP address that's registered to that city in a public geofeed. While websites have lost the ability to see the client's IP address, they can still access the client's coarse location. Note that this coarseness is often configurable by the user.

2) VPNs and other proxies. These are generally not affiliated with the browser, and are implemented outside of the browser. Some of these intentionally mask the client's location. For example, some VPN providers allow the client to pick a country, and then use egress IPs that are mapped to that country in a public geofeed.
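
To make the geofeed mapping above concrete, here is a minimal sketch of how an egress IP could be resolved against a geofeed. It assumes the CSV layout of RFC 8805 (prefix, country, region, city, postal code). The entries use documentation prefixes and are made up for illustration, and the names GEOFEED, parse_geofeed, and lookup are hypothetical, not taken from the draft or from any operator's implementation.

```python
import ipaddress

# Hypothetical geofeed entries in the RFC 8805 CSV layout:
# prefix,country,region,city,postal
GEOFEED = """\
# illustrative data only (documentation prefixes)
192.0.2.0/24,US,US-CA,San Francisco,
198.51.100.0/24,FR,FR-IDF,Paris,
2001:db8::/32,DE,DE-BE,Berlin,
"""


def parse_geofeed(text):
    """Parse geofeed CSV text into (network, country, region, city) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = [f.strip() for f in line.split(",")]
        fields += [""] * (5 - len(fields))  # tolerate omitted trailing fields
        prefix, country, region, city, _postal = fields[:5]
        entries.append((ipaddress.ip_network(prefix), country, region, city))
    return entries


def lookup(entries, address):
    """Return the coarse location for an address, or None if no prefix matches."""
    ip = ipaddress.ip_address(address)
    matches = [e for e in entries if e[0].version == ip.version and ip in e[0]]
    if not matches:
        return None
    # Prefer the most specific (longest) matching prefix.
    _net, country, region, city = max(matches, key=lambda e: e[0].prefixlen)
    return {"country": country, "region": region, "city": city}


if __name__ == "__main__":
    feed = parse_geofeed(GEOFEED)
    print(lookup(feed, "198.51.100.7"))
```

A website (or a third-party database) that consumes such a feed would see only the location registered for the egress prefix, not the client's own address.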

draft-pauly-httpbis-geoip-hint is mainly targeted at (1). We've found that the majority of users do want locally-relevant content. If the privacy proxy were to use egress IPs with wildly inaccurate locations, many users would disable proxying to revert to the browsing experience that they're used to. Because of this, it's a real requirement for privacy proxies to have the option to provide websites with accurate coarse location.

When (2) is enabled, that generally happens underneath the browser. The browser just makes regular requests, and they get routed through the VPN without the browser needing to do anything special or even know that this is happening. For other proxies, the proxy client is in the browser, so the browser knows that proxying is happening, but it still routes traffic through those proxies, similar to a VPN.

I realize that the text in the draft that refers to proxies and VPNs isn't very clear. That text is meant to focus on (2). When (2) is enabled, we absolutely need to use the egress IP from server (2) to determine the location. That ensures that if the client is using the VPN to pretend they're in a different country, Sec-CH-IP-Geo will match the intended country. However, for (1), we want to use the client's IP address as seen by the privacy proxy. That allows us to have a coarser location in the privacy proxy egress IP geofeed, and then have the header convey the level of granularity desired by the user. When both (1) and (2) are enabled at the same time, traffic from the client flows through (2) before it reaches (1), so the IP address used to compute the geo will still be the egress IP from (2).
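
The rule described in the paragraph above can be sketched roughly as follows. This is only an illustration of the reasoning in this message, not text from the draft: the function name geo_hint_source_ip and its parameters are hypothetical, and the addresses are documentation examples.

```python
def geo_hint_source_ip(client_ip, vpn_egress_ip=None):
    """Return the address the geo hint would be computed from.

    Assumption (from the discussion, not the draft text): a VPN or
    other proxy of kind (2) sits underneath the browser, so anything
    downstream of it, including a privacy proxy of kind (1), only
    ever sees the egress IP of (2).
    """
    if vpn_egress_ip is not None:
        # (2) alone, or (2) followed by (1): the egress IP of (2) is used,
        # so the hint matches the location the user chose via the VPN.
        return vpn_egress_ip
    # (1) alone: the privacy proxy sees the client's own address.
    return client_ip


if __name__ == "__main__":
    # Only the privacy proxy (1) is enabled.
    print(geo_hint_source_ip("203.0.113.5"))
    # A VPN (2) is also active; its egress IP is what (1) observes.
    print(geo_hint_source_ip("203.0.113.5", vpn_egress_ip="198.51.100.9"))
```

In both cases the lookup from that address to a location is still performed by whoever operates (1), which is the point raised in the reply at the top of this message.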

Hope this helps,
David

On Thu, Oct 24, 2024 at 11:03 AM Ben Schwartz <bemasc@meta.com> wrote:
> Ben > If implemented today, this proposal would privilege vendor-provided proxies over third-party proxies.  This seems undesirable to me.

> As it stands today, no browser offers open APIs to switch out their privacy proxy provider.

This is only true in some trivial sense.  Chrome [1] and Android [2] both offer public APIs that allow third parties to control the proxy settings, and Chrome defers to the OS proxy settings as well.  Tens (hundreds?) of millions of users use non-browser-affiliated proxy or "VPN" services to protect their privacy.

> In particular, you'd need to formalize exactly what kind of blinded tokens are used, and a very high number of knobs such as token refresh rate, proxy failure backoff timers, etc.

No, that is all internal to the proxy/VPN operator's account management logic.  There's no need for browser customization there.

Today, an app downloaded from an app store can provide reasonably equivalent functionality to a proxy offered by the browser vendor, from the user's perspective.  This proposal would allow an improved user experience only for the browser vendor's own proxy service.

--Ben

[1] https://developer.chrome.com/docs/extensions/reference/api/proxy
[2] https://developer.android.com/reference/android/net/VpnService.Builder#setHttpProxy(android.net.ProxyInfo)
Received on Thursday, 24 October 2024 22:07:11 UTC