Re: [w3ctag/design-reviews] Private Proof API (Issue #1071) from Martin Thomson on 2025-05-22 (public-webapps-github@w3.org from May 2025)

From: Martin Thomson <notifications@github.com>
Date: Thu, 22 May 2025 02:50:14 -0700
To: w3ctag/design-reviews <design-reviews@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <w3ctag/design-reviews/issues/1071/2900596415@github.com>

martinthomson left a comment (w3ctag/design-reviews#1071)

This proposal is substantially similar to the Private State Token API (https://wicg.github.io/trust-token-api/, https://github.com/w3ctag/design-reviews/issues/780), but it seeks to address *one* of the key privacy concerns with that approach.

Private State Tokens allowed a site to write a bit[^one] when it was in a top-level or same-site context and read that bit when it was embedded in a cross-site context. The major problem with that from a privacy perspective is that no one really got to know what the bit meant. The intent was that the bit would signal that the site considered the user to be trusted[^trust], but nothing held sites to that meaning.

[^one]: In practice, allowances in the API for key rotation and other things meant that this was sometimes *two* bits. And sometimes there could be an additional "hidden" bit. Either way, the intent was that this be just a single bit.

[^trust]: Again, what "trust" means here is vague and whether it is the user or their browser that was trusted is unclear.

The change here is to move to a single bit again, but strictly limit what that bit can mean. In the same-site context, a site is able to sign the current time. That genuinely carries a single bit into the browser that the site cannot control. In the cross-site context, the site can then ask the browser for proof that the time it previously signed was before a time of its choice. The site then either receives that proof or not, providing it with a single bit. The site has some control, because it can choose how old the original signature must be, but it doesn't get anything more than that.
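
As a rough illustration of the information flow described above (not the proposal's actual API or cryptography), the exchange can be modelled in a few lines of Python. Every name here (`StoredToken`, `issue_token`, `prove_older_than`) is hypothetical, and the signature and zero-knowledge proof are left out entirely; the sketch only shows that the cross-site caller ends up with one boolean.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StoredToken:
    """Hypothetical browser-side record of a timestamp an issuer signed."""
    issuer: str
    signed_at: int  # Unix seconds; set from the clock, not chosen by the site


def issue_token(issuer: str, now: int) -> StoredToken:
    # Same-site context: the issuer signs the current time.
    # A real implementation would attach a signature; it is omitted here.
    return StoredToken(issuer=issuer, signed_at=now)


def prove_older_than(token: Optional[StoredToken], threshold: int) -> bool:
    # Cross-site context: the site picks `threshold` and learns only whether
    # a token exists whose signed time is before it. Nothing else is returned.
    return token is not None and token.signed_at < threshold


if __name__ == "__main__":
    DAY = 86_400
    now = 1_750_000_000  # arbitrary example timestamp
    token = issue_token("issuer.example", now - 90 * DAY)
    print(prove_older_than(token, now - 30 * DAY))   # True
    print(prove_older_than(token, now - 180 * DAY))  # False
    print(prove_older_than(None, now - 30 * DAY))    # False
```

The only lever the site has in this model is the `threshold` argument, which matches the "some control" noted above.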

For anti-abuse purposes, this is useful because it allows sites to call out to an anti-abuse provider and separate potentially abusive visitors into two groups: those with a history longer than some predefined age, and those without. A long history cannot be spoofed (with caveats), which allows the site to (at its discretion) focus its abuse protection mechanisms more closely on those without a long history.

This strikes *a* balance between the anti-abuse needs and privacy, for sure. Whether it strikes an *appropriate* balance depends on several factors, which we'd like to see more detail on:

1. One of the major drawbacks of Private State Tokens is the natural bias toward centralization. If you don't limit the number of entities that can supply tokens, the API is basically a fingerprint-creation API. This proposal is no different in that regard. In particular, this creates a tendency for sites to partner with anti-abuse vendors who have been around longer and those who have been exposed to more users. The proposal does not appear to (currently) have robust safeguards in place for that bias. We'd like to see some more analysis about what the options might be here. The explainer section on side channels is not really adequate to understand the full privacy implications of this.

1. The idea of issuer "fungibility" explored in the explainer seems impractical as described, but the general idea that the choice of issuer is not revealed through the proof is worth exploring as a potential mitigation for the natural centralization tendency in the API. We understand that this has a range of challenges -- it isn't clear what would incentivize the necessary cooperation, it requires new and potentially different cryptography, plus it might have other centralization biases -- but we want to encourage further investigation along these lines.

1. The underlying cryptography includes rate limiting measures, which allow sites to specify a maximum number of times that their signature can be used in a given time period. This exists to prevent a user from sharing the signature they receive with others. There is discussion in the explainer of putting these rate limiting parameters in a `.well-known` location on a site, but the parameters seem to be subject to change, which might enable abuse. We'd like to see this more clearly specified, with clear guardrails on use so that they cannot be abused. This might require some analysis to support the choices made.

1. We'd like to see some consideration given to device portability of the underlying keying material that is used. Does your trustworthiness reset if you get a new phone? (This might be a desirable property because it means that there are more reasons that someone might appear too "new", more below.)

1. It's not clear that it is safe to expose enrollment in this system in a cross-site context without revealing additional fingerprinting bits.

1. Given the high computation cost involved in producing the bit, what safeguards can be put in place by user agents to ensure that this does not lead to a different form of abuse? This is a particular concern on battery-powered devices, which tend to have fewer computational resources.

1. Some work clearly needs to be done to safeguard against revealing side channels. Is it not possible to unconditionally perform the computation and return an invalid value if no token exists? This also applies to the `hasToken()` API, which leaks a fingerprinting bit.

1. The API involves a fetch for all invocations, which seems unnecessary. Though both generation of signatures and validation of proofs might need to occur on servers, imperative JavaScript APIs to manage the necessary interactions are sufficient.

1. Other similar APIs have included a non-trivial base rate of forced errors, even for users who have valid tokens. This is not so much to provide a measure of privacy, though it could provide a small measure of differential privacy, but more to ensure that sites do not become too dependent on the API being able to produce valid proofs. This forces sites to provide a reasonable user experience for people who cannot satisfy their challenge. After all, this API encodes a bias against new or younger users. What, if anything, do you plan to do to encourage sites to be responsible in that regard?
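
To make the last few points concrete, here is a minimal Python sketch of how a user agent might combine unconditional computation with a base rate of forced errors. It is an assumption-laden model, not anything taken from the explainer: `compute_proof` is a stand-in for the real (expensive) proof generation, `request_proof` and `forced_error_rate` are hypothetical names, and doing the same logical work on both paths in Python does not by itself guarantee identical timing.

```python
import random
from typing import Optional


def compute_proof(signed_at: int, threshold: int) -> bool:
    # Stand-in for the expensive proof computation (hypothetical).
    return signed_at < threshold


def request_proof(signed_at: Optional[int], threshold: int,
                  forced_error_rate: float, rng: random.Random) -> bool:
    # Always run the computation. When no token exists, substitute a dummy
    # timestamp equal to the threshold, which can never satisfy the strict
    # "before" check, so the "no token" path does the same work and fails.
    effective = signed_at if signed_at is not None else threshold
    valid = compute_proof(effective, threshold)

    # Forced errors: fail a fraction of requests regardless of validity, so
    # sites cannot assume that a valid token always yields a proof.
    forced_failure = rng.random() < forced_error_rate
    return valid and not forced_failure


if __name__ == "__main__":
    rng = random.Random()
    trials = 10_000
    # 0.05 is an arbitrary illustrative rate, not a recommended value.
    ok = sum(request_proof(0, 1_000, 0.05, rng) for _ in range(trials))
    print(f"valid token produced a proof in {ok} of {trials} requests")
```

In this model a site sees the same "no proof" outcome for a missing token, a token that is too new, and a forced error, which is what pushes it to handle failure gracefully.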

--
Reply to this email directly or view it on GitHub:
https://github.com/w3ctag/design-reviews/issues/1071#issuecomment-2900596415

Received on Thursday, 22 May 2025 09:50:18 UTC