- From: David Schinazi <dschinazi.ietf@gmail.com>
- Date: Mon, 28 Oct 2024 18:05:23 -0700
- To: Stephen Farrell <stephen.farrell@cs.tcd.ie>
- Cc: Dustin Mitchell <djmitche@google.com>, Ted Hardie <ted.ietf@gmail.com>, Ben Schwartz <bemasc@meta.com>, Watson Ladd <watsonbladd@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
- Message-ID: <CAPDSy+6cCDVafNUT0g4sn0XPZ=420H3nEPqkYJg=O8Yxp9Og2Q@mail.gmail.com>
Hi,

I'm combining multiple responses into one email, and adding to what Dustin wrote.

To Ted's first point, the draft authors are all absolutely open to changing our technology choices. However, we can't commit to changing our employers' business practices - as you can imagine, that is beyond our paygrades. So we have to solve the technical problem at hand within the confines of what we have control over as engineers. So Stephen, while fixing all abuses of user location on the Internet is a worthwhile goal, it's not something that we can achieve with an RFC.

From Ted's email, it's clear that we (the draft authors) did a poor job of explaining which IP the geolocation data is derived from. Let me first define terminology:

1) client IP. This is the IP that a publicly-accessible server would see if the client were to open a direct TCP connection to that server. If the client is behind a NAT (or multiple NATs), this is the public IP of the NAT furthest from the client.

2) proxy egress IP. This is the IP that a publicly-accessible server would see if the client were to open a proxied TCP connection to that server, where all the application-layer bytes are flowing through the proxy.

When a privacy proxy is enabled, that means that when a user is visiting a website, the website will see the proxy egress IP. If the privacy proxy is disabled, the website will see the client IP.

The key property we want here is that the new location information that we're sending provides no more information than what can be derived solely from the client IP.

To use an example:

* my client IP identifies my household, and combined with other data such as screen size and user agent, pretty much identifies my device - so all in all it identifies "David Schinazi"
* the geolocation data that we derive from my client IP would map to "San Francisco"
* the proxy egress IP would map to "Northern California"

Our goal here is to give the user a browsing experience where searching for "pizza" will find restaurants in San Francisco, not Sacramento. Because otherwise the user disables the privacy proxy, and we're back to leaking their client IP to all websites.

In order for us to reach these goals, we need the geolocation to be based on the client IP while only providing websites with the proxy egress IP. I can't think of a way to make that work purely with communication between the proxy and website.

David

On Mon, Oct 28, 2024 at 10:23 AM Stephen Farrell <stephen.farrell@cs.tcd.ie> wrote:

> Hiya,
>
> On 28/10/2024 16:56, Dustin Mitchell wrote:
> >
> > This also provides an opportunity to incrementally improve the situation
> > for tracking of users' location by making it an active signal that is under
> > the control of the client (rather than the client's ISP).
>
> Is the above actually accurate though? ISTM that detailed location
> data is mostly sent by clients in payloads, so adding an HTTP header
> seems like it has no effect on such application layer leakage other
> than to add a new way in which location details can be exposed.
>
> Defining the problem to be addressed to be only a tiny part of the
> abuses of location data seems to me a wrong starting point in a
> different sense to the one Ted described already.
>
> Cheers,
> S.

Received on Tuesday, 29 October 2024 01:05:40 UTC