- From: Tommy Pauly <tpauly@apple.com>
- Date: Thu, 24 Oct 2024 09:04:03 -0700
- To: Ted Hardie <ted.ietf@gmail.com>
- Cc: David Schinazi <dschinazi.ietf@gmail.com>, Stephen Farrell <stephen.farrell@cs.tcd.ie>, Watson Ladd <watsonbladd@gmail.com>, Ben Schwartz <bemasc@meta.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
- Message-id: <CCCAD914-7C94-4328-9EA0-153EAE4F2EC6@apple.com>
Thanks for these comments, Ted. I definitely agree that this danger of misuse is the primary concern.

As I am listed as an author here, I do indeed want to see some progress in this area for (a) reducing the barriers to broad adoption of IP privacy techniques and (b) decreasing the ecosystem’s reliance on geo IP databases. However, I don’t think the technical solution in this version is yet sufficient to focus the usage. I think additional MUSTs here represent the correct intents, but as you point out, they are aspirational as they are not enforced by a technical mechanism. The interesting question in my mind is if there is a way to add more assurances.

As an aside, for the VPN-to-another-country case, I think the intent would be that the geo hint would be based on matching the egress IP of the VPN or proxy only, so the hint would reflect the user’s intent of using the VPN, not their home network.

When we look at the ecosystem today, we could draw things out as involving four parties, commonly:

1. The client
2. The client’s IP provider (ISP, VPN, Proxy); this entity often has some relationship with the client, including one where the client is paying the provider for access through their network
3. The server
4. A geo IP database provider, often a service that is paid for by the server; it is not the authoritative source for the IP information, but a middle entity

The pain points generally come from the relationship of 3 & 4: being incorrect, out-of-date, conflicting with the information provided by 2, etc. Ideally, the hint that this draft is working on can start to reduce the ecosystem to just 1+2+3. The question is if it can be limited to only being equivalent to the information already implicit in 2’s selection of IP address.

Two examples of broad ideas (that need concrete details to actually work correctly) are:

- The client hint about the geo-mapping for its IP would only be readable by the server if the server already knows the corresponding IP.
  Essentially, tie the geo information to be from the IP provider if and only if that is the IP that the server already sees
- The server could tell the client “I see you coming from IP address X, tell me how to correctly map that to a location”, and the client could provide an entry that is inherently tied to the IP provider’s authoritative database

Thanks,
Tommy

> On Oct 24, 2024, at 4:58 AM, Ted Hardie <ted.ietf@gmail.com> wrote:
>
> Hi David,
>
> Thanks for the responses.
>
> As before I believe the authors are acting in good faith, and I understand at least to some degree how your goals fit into the deployments with which you are concerned.
>
> This building block, however, won't get used in just those deployments once it is standardized and widely available. As a simple example, a deployment in which the VPN provider has an out-of-date geo-ip database compared to the large scale service the user is talking to is certainly possible.
>
> I also don't think you are necessarily parsing the user intent correctly here when they combine an HTTP-speaking app with a VPN or privacy proxy. Let's say someone has the app of a national TV service; they use a VPN to access it when they do not happen to be in the national territory. If the APP can send this hint based on a geo-ip service provided by the national TV service, then the user's wishes are frustrated, since they joined a network within the coverage range via the VPN. The desire to geofence these services is, at the base, related to licensing regimes that the IETF has no control over, so we probably shouldn't debate the use case over-much. But I do think that kind of use demonstrates that there will be other uses of this if standardized.
>
> Ultimately, you are trying to limit the use of this to a single type of system by using the standard to constrain it. In a controlled system or system-engineering style SDO, that kind of works.
> But I don't think that MUSTs that can't be guaranteed on the wire or by all the parties are going to preserve anyone's privacy here.
>
> Thanks again for the response,
>
> Ted
>
> On Wed, Oct 23, 2024 at 10:00 PM David Schinazi <dschinazi.ietf@gmail.com> wrote:
>> Hi folks, and thanks for the input! I'm going to try to respond to various comments and questions below.
>>
>> > Stephen
>> > From my POV, the situation with the web and location exposure is currently just awful and getting worse over time. (That last isn't based on explicit measurement.) ISTM that before adding more ways in which people's location can be abused, we ought sit back and see if there's a real improvement (for people, not services) to be found or not.
>>
>> Today, many websites log the IP address of clients that visit them. They use that information for many purposes, one of them is knowing where the user is. I agree that the privacy properties of this mass collection are pretty terrible, but no one on this thread has the power to magically make that stop. What we're doing with this proposal (and the corresponding privacy proxies that it is paired with) is to prevent the pervasive collection of this information. The proxies now provide the server with an IP address that no longer identifies the user. And now the coarse location is an active signal as opposed to something available to pervasive surveillance. I agree with your vision of how the world should be, but we can't will that into existence, so we have to take pragmatic steps towards it.
>>
>> > Watson
>> > Privacy sandbox proposals like accept hints are much more powerful frameworks than is possible from GeoIP info which is unconditionally given by browsers by virtue of connecting. When the experience through a privacy preserving solution is worse, people turn it off. We're better off with client controlled hints.
>>
>> +1
>>
>> > Ted
>> > If I understand the authors' thinking, if a client adheres to this MUST the situation will be no worse than the current situation, as it simply moves the lookup to the client. Since this MUST is not part of the wire protocol, I have some serious reservations that it will be honored, but this proposal is definitely better than the previous situation in which how the client got the data was out of scope for the proposal.
>>
>> You're correct that some of this comes down to trust that the client (in most cases, a Web browser or similar application) doesn't abuse this. At the end of the day, anyone using a given browser has to place some trust in that browser. After all, the browser could be sending all the user's passwords to the NomCom using syslog, but we all trust our browser to not do that. A very relevant example is the W3C Geolocation API thanks to which any website can request permission to see the user's location. If the user consents [1], then their GPS coordinates are sent to the website. An evil browser could skip the mandatory user consent step, but browsers don't. If they did, I suspect the ensuing loss of market share would dwarf any monetary benefits they might have gotten from selling such data.
>>
>> [1] https://www.w3.org/TR/geolocation/#user-consent
>>
>> > Ted
>> > I would like to understand how the authors believe the browsers will identify a cooperating geo-ip server. Are the authors presuming that the VPN servers or proxies will provide this service?
>>
>> The main deployment model the authors are looking into is privacy proxies (e.g., Google's IP Protection and Apple's iCloud Private Relay). The privacy proxy operator would provide the geoIP lookup service.
>>
>> > Ted
>> > One of the authors' stated goals is to improve the accuracy of the day ("This approach will not only enhance geolocation accuracy").
>> > It seems to me that it only does that in the case that the client is using a VPN or proxy; how can it do that in the direct connection case if it is using either the same geolocation data the server would have or what it would get from a different public database?
>>
>> Our proposal makes the assumption that the privacy proxy's geoIP database is higher quality (i.e., more up-to-date) than the one the average website has access to. The assumption holds for the privacy proxy deployments that are interested in this feature. For example, Google's geoIP team spends quite a bit of resources to ensure that our database has high accuracy; whereas we've found that it's common for website operators to get a free copy of a geoIP database once and then never update it.
>>
>> > Ben
>> > If implemented today, this proposal would privilege vendor-provided proxies over third-party proxies. This seems undesirable to me.
>>
>> As it stands today, no browser offers open APIs to switch out their privacy proxy provider. In particular, you'd need to formalize exactly what kind of blinded tokens are used, and a very high number of knobs such as token refresh rate, proxy failure backoff timers, etc. That's something that could happen one day, but it's a large effort no matter what. This proposal doesn't change that: you'd just need to add the URL of the geoIP service to the very long list of configuration parameters that would need to be specified.
>>
>> Thanks,
>> David
>>
>> On Tue, Oct 22, 2024 at 7:31 AM Ben Schwartz <bemasc@meta.com> wrote:
>>> I understand this draft as a way to override the server's geo-IP database for cases where it is giving "wrong" answers. However, the draft also says:
>>>
>>> > The client MUST determine geolocation using a cooperating server that looks up the client's IP address in a geo-IP database. ... the IP address used to generate this geolocation hint MUST be ...
>>> > the "egress IP address"
>>>
>>> So to be precise, this draft is about allowing the client to select a "better" geo-IP database. In practice, "better" means "affiliated with my current proxy (VPN) operator". However, in current browsers and operating systems, a proxy operator has no way to inform the operating system of an affiliated geo-IP database server. If the operating system or browser vendor chooses the geo-IP database server, then only vendor-affiliated proxies will benefit from improved answers under this system.
>>>
>>> If implemented today, this proposal would privilege vendor-provided proxies over third-party proxies. This seems undesirable to me.
>>>
>>> This problem could be resolved by showing that these platforms will offer open (but proprietary) APIs to configure a geo-IP lookup server, by changing the geolocation rule from provenance to granularity (so that the platform can derive it from GPS), or by linking this proposal to a standard for network-based location that a proxy could override (e.g. DHCP GEOCONF_CIVIC, RFC 4776).
>>>
>>> --Ben
>>>
>>> From: internet-drafts@ietf.org <internet-drafts@ietf.org>
>>> Sent: Friday, October 18, 2024 11:38 PM
>>> To: i-d-announce@ietf.org <i-d-announce@ietf.org>
>>> Cc: ietf-http-wg@w3.org <ietf-http-wg@w3.org>
>>> Subject: I-D Action: draft-pauly-httpbis-geoip-hint-01.txt
>>>
>>> Internet-Draft draft-pauly-httpbis-geoip-hint-01.txt is now available. It is a
>>> work item of the HTTP (HTTPBIS) WG of the IETF.
>>>
>>>    Title:   The IP Geolocation HTTP Client Hint
>>>    Authors: Tommy Pauly
>>>             David Schinazi
>>>             Ciara McMullin
>>>             Dustin Mitchell
>>>    Name:    draft-pauly-httpbis-geoip-hint-01.txt
>>>    Pages:   7
>>>    Dates:   2024-10-18
>>>
>>> Abstract:
>>>
>>>    Techniques that improve user privacy by hiding original client IP
>>>    addresses, such as VPNs and proxies, have faced challenges with
>>>    server that rely on IP addresses to determine client location.
>>>    Maintaining a geographically relevant user experience requires large
>>>    pools of IP addresses, which can be costly. Additionally, users
>>>    often receive inaccurate geolocation results because servers rely on
>>>    geo-IP feeds that can be outdated. To address these challenges, we
>>>    can allow clients to actively send their network geolocation directly
>>>    to the origin server via an HTTP Client Hint. This approach will not
>>>    only enhance geolocation accuracy and reduce IP costs, but it also
>>>    gives clients more transparency regarding their perceived
>>>    geolocation.
>>>
>>> The IETF datatracker status page for this Internet-Draft is:
>>> https://urldefense.com/v3/__https://datatracker.ietf.org/doc/draft-pauly-httpbis-geoip-hint/__;!!Bt8RZUm9aw!6mICNhw7bj76_cBbl-3D72eXYBcZxFGvFOI54tNs5lTdKMqdRbZpFhfeP8IJ_d5AZgN56hbXoGe7HJafNsNz1Q$
>>>
>>> There is also an HTML version available at:
>>> https://urldefense.com/v3/__https://www.ietf.org/archive/id/draft-pauly-httpbis-geoip-hint-01.html__;!!Bt8RZUm9aw!6mICNhw7bj76_cBbl-3D72eXYBcZxFGvFOI54tNs5lTdKMqdRbZpFhfeP8IJ_d5AZgN56hbXoGe7HJauIOfmgA$
>>>
>>> A diff from the previous version is available at:
>>> https://urldefense.com/v3/__https://author-tools.ietf.org/iddiff?url2=draft-pauly-httpbis-geoip-hint-01__;!!Bt8RZUm9aw!6mICNhw7bj76_cBbl-3D72eXYBcZxFGvFOI54tNs5lTdKMqdRbZpFhfeP8IJ_d5AZgN56hbXoGe7HJZikroa3Q$
>>>
>>> Internet-Drafts are also available by rsync at:
>>> rsync.ietf.org::internet-drafts
Received on Thursday, 24 October 2024 16:04:20 UTC