Re: I-D Action: draft-pauly-httpbis-geoip-hint-01.txt from Nick Doty on 2024-10-24 (ietf-http-wg@w3.org from October to December 2024)

From: Nick Doty <ndoty@cdt.org>
Date: Thu, 24 Oct 2024 10:47:46 -0400
To: Watson Ladd <watsonbladd@gmail.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CA+tYtvGtpdhoSkLd4gHZ1pqTq5U+M-o=qj7w7YeuAEPhL6r82Q@mail.gmail.com>

I'm glad to see that there is a Privacy Considerations section in this
version. I believe I raised that in 2022, but I don't think this draft
yet covers all of my initial, incomplete set of considerations:
https://github.com/tfpauly/privacy-proxy/issues/196

I have some particular questions and concerns about the existing
privacy considerations writeup.

I am uncertain about this statement:

> In particular, when a
> privacy technology such as a VPN is in use, the value MUST NOT reveal
> information about the user's location that would otherwise be hidden.

Is that a requirement that we expect to be followed? My understanding
from the Introduction is that a motivation here is to make it easier
for a proxy to use a smaller number of egress IP addresses that don't
all correspond to specific geographic areas (because maintaining all
those egress addresses is complex and expensive) but still to provide
the same level of geographic detail (which would otherwise be hidden
by not maintaining the same egress IP fidelity). That would suggest
that the major implementations plan to do the exact opposite of this
requirement. But the Client Behavior section seems to argue that the
client must know the egress IP address and only use that for
determining location without additional granularity.

Or perhaps another way to think about this: some services are
currently using egress IPs with historical known geographic areas, but
once proxying is implemented, the user's location would in any case be
otherwise hidden, even if it may have been chosen by a service
provider to emulate that geolocation functionality.

Granularity also isn't described in the privacy considerations
section. It seems like the field is mandated to include down to the
city level but not to the postal code. I'm not clear why the client
shouldn't be able to decide or why the spec shouldn't recommend a less
granular setting by default. Many current uses of IP geolocation would
be satisfied by national or regional level granularity. The PEARG
draft on IP address privacy might be useful for tracking some of those
purposes or for evaluating the privacy implications of this as a
replacement signal:
https://pearg.org/draft-ip-address-privacy/draft-irtf-pearg-ip-address-privacy-considerations.html#name-rough-geolocation
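
As a rough illustration of what a client-chosen granularity could look
like, here is a minimal sketch. It assumes, purely for illustration, that
the hint value is a comma-separated "country,region,city" string; the
function name, the level names, and that layout are hypothetical and not
taken from the draft.

```python
# Hypothetical granularity levels, ordered from coarsest to finest.
LEVELS = ("country", "region", "city")


def coarsen(value: str, level: str = "country") -> str:
    """Keep only the fields of `value` up to and including `level`.

    `value` is assumed to be "country,region,city"; that layout is an
    assumption made for this sketch, not the draft's definition.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown granularity: {level!r}")
    fields = [field.strip() for field in value.split(",")]
    keep = LEVELS.index(level) + 1
    return ",".join(fields[:keep])


# A less granular default: reveal only the country unless the client
# explicitly chooses a finer level.
print(coarsen("US,US-AL,Alabaster"))
print(coarsen("US,US-AL,Alabaster", level="region"))
```

Defaulting to the coarsest level and requiring an explicit choice for
anything finer is one way to express "less granular by default".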

This requirement is also unclear:

> The hint MUST NOT be sent by default or in an always-on manner.

I believe this means that the hint should only be sent when the server
has provided a corresponding Accept-CH response header, but it doesn't
suggest that the user should have the ability to opt in or to control
this generally, or that it should only be sent to servers with some
trust relationship, or where a guarantee has been provided about how
the data will be used. If that's right, I expect many readers will be
confused by this statement, as it sounds pretty always-on.

Or there might just be some ambiguity here that should be clarified:

> and in contexts where sharing location data serves a clear purpose, such as for location-based services.

Is this an additional condition that's required, or an alternative to
the server providing an Accept-CH header?

Providing a location hint when the user wants to in order to support a
location-based service would be very different from providing it every
time any server indicates that it wants that information.
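
To make the two readings concrete, here is a minimal sketch of the
client-side decision under each one. The hint token, the class, and the
function names are all hypothetical; nothing here is taken from the draft
or from any implementation.

```python
from dataclasses import dataclass, field

# Hypothetical hint token, used only for this sketch.
HINT = "example-geo-hint"


@dataclass
class RequestContext:
    # Hint tokens the server listed in its Accept-CH response header.
    accept_ch: set = field(default_factory=set)
    # Whether the user chose to share location with this site
    # (an assumed client-side setting).
    user_opted_in: bool = False


def send_when_requested(ctx: RequestContext) -> bool:
    """Reading 1: the server's Accept-CH request is sufficient."""
    return HINT in ctx.accept_ch


def send_when_requested_and_opted_in(ctx: RequestContext) -> bool:
    """Reading 2: the server's request AND a user choice are both required."""
    return HINT in ctx.accept_ch and ctx.user_opted_in


ctx = RequestContext(accept_ch={HINT}, user_opted_in=False)
print(send_when_requested(ctx))               # True
print(send_when_requested_and_opted_in(ctx))  # False
```

Under the first reading, any server that asks receives the hint, which
is the behavior that reads as always-on.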

On Mon, Oct 21, 2024 at 8:41 PM Watson Ladd <watsonbladd@gmail.com> wrote:
>
> GeoIP does lots of useful personalization and can hurt when wrong.
> Clients providing a hint is much nicer akin to the way en-US vs en-UK
> works for deciding what subsidiary you are most interested in.
>
> If the personalization issues can use a client provided signal that
> avoids current state where many privacy preserving technologies
> provide a worse experience because of inaccurate personalization
> responses.
>
> Privacy sandbox proposals like accept hints are much more powerful
> frameworks than is possible from GeoIP info which is unconditionally
> given by browsers by virtue of connecting. When the experience through
> a privacy preserving solution is worse, people turn it off.

I'm not clear that we have evidence that the experience of a privacy
preserving proxy that doesn't automatically reveal your city to any
server that's interested is generally considered by users to be worse.
More evidence on user experience would be useful, but it also seems
likely that the reactions of servers and configuration of clients may
not be static in this regard: if IP address privacy is provided more
generally, servers may stop assuming that IP geolocation is an
accurate signal to rely on for certain location-based services and
instead provide for user-initiated functionality when users want to
geolocate.

> We're better off with client controlled hints.

I do think this is possible, if users in fact see greater control with
client controlled hints.
Received on Thursday, 24 October 2024 14:48:03 UTC