Hundreds of identity providers from Bush,Judith on 2021-11-30 (public-fed-id@w3.org from November 2021)

From: Bush,Judith <bushj@oclc.org>
Date: Tue, 30 Nov 2021 21:15:49 +0000
To: dan sinclair <dsinclair@google.com>, Brock Allen <brockallen@gmail.com>, "public-fed-id@w3.org" <public-fed-id@w3.org>
Message-ID: <BL0PR06MB4497E3DA5E51077F0BA25593CB679@BL0PR06MB4497.namprd06.prod.outlook.com>
  *   When you say there are hundreds of customers with their own IDPs, does
this mean the customer has an IDP installed on-premises that the
customer upgrades and runs? Or is it a SaaS solution where they
configure a solution run by your company? Or is it a mix of both?

While there is some migration to cloud solutions in the higher-ed space, a significant number of institutions in the research and education federated identity space run their own IdPs or IdP proxies.

Go to https://technical.edugain.org/entities, select Entity Type = Identity Providers, choose “SAML 2.0 support” = “Supported SAML 2.0”, and click Show. The eduGAIN collection of SAML 2 metadata returns 4617 entries across 71 different identity federations. Some of these federations are hub federations, such as SURFnet in the Netherlands. There, a relying party sends a request to the SURF hub; the SURF hub relays the request to the institutional IdP; the IdP releases attributes and a response to SURF; SURF may enrich that response with attributes from “virtual organizations” such as an international laboratory; SURF then applies the end user’s consent choices (when appropriate); and finally SURF builds a response to the requesting relying party.
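
As a rough illustration of that hub flow, here is a minimal Python sketch. Everything in it is hypothetical: the function names, attribute names, and data shapes are made up for the example and are not SURF's implementation or any SAML library's API.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    subject: str
    attributes: dict = field(default_factory=dict)


def institutional_idp(subject: str) -> Response:
    # Hypothetical institutional IdP: releases attributes for the user.
    return Response(subject, {"affiliation": "staff",
                              "mail": f"{subject}@example.edu"})


def hub_handle_request(subject: str, virtual_org_attrs: dict,
                       consented: set) -> Response:
    # 1. Relay the relying party's request to the institutional IdP.
    response = institutional_idp(subject)
    # 2. Optionally enrich the response with virtual-organization attributes.
    merged = {**response.attributes, **virtual_org_attrs.get(subject, {})}
    # 3. Apply the end user's consent choices (hypothetical allow-list).
    released = {k: v for k, v in merged.items() if k in consented}
    # 4. Build the response for the requesting relying party.
    return Response(subject, released)
```

In a direct federation the relying party would talk to the institutional IdP itself, so steps 2 to 4 would not be performed by an intermediary.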

In the InCommon Federation, the relationships are direct, IdP to SP.

Note that granting organizations such as the National Institutes of Health (a US organization that gives international grants) require that the claims sent along in the SAML request include a demonstration that the user went through multifactor authentication.

judith


From: dan sinclair <dsinclair@google.com>
Date: Tuesday, November 30, 2021 at 15:39
To: Brock Allen <brockallen@gmail.com>
Cc: public-fed-id@w3.org <public-fed-id@w3.org>
Subject: [External] Re: feedback on FedCM from BlinkOn
Hi Brock,

First off, thank you very much for sending this along, the feedback is
greatly appreciated (and apologies for my slow response). See inline responses below.

One general note, the spec is in no way finished. We're pushing it out in a state
where we think it's a good place to start the discussion, as opposed to being "done".
So any and all feedback at this point is great as we evolve the spec.


On Thu, Nov 25, 2021 at 3:01 AM Brock Allen <brockallen@gmail.com> wrote:
> First, my understanding is that there are lots of other clever ways to thumbprint browsers beyond cookies, so the premise that cookies are the main threat seems oversimplified. Thus killing cookies in this way seems heavy-handed given the collateral damage. If my understanding is correct, then the trackers will just work harder to use the other existing approaches. Perhaps I'm wrong and/or missing something? If so, I'd love the ELI5 [technical] explanation. This threatens the premise of this whole exercise.


You are correct in that there are various ways to fingerprint a user.
Cookies make this easier as the third-party cookie is sent along in
iframe requests. So, if you get a cookie set, and can embed a small
iframe, you're done. This is probably the simplest of ways to track
the user. The various other ways to fingerprint are also an issue, but
(at least in my mind) they are slightly more difficult to exploit.
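
To make the iframe case concrete, here is a toy Python model. It is only a sketch: the `Tracker` and `Browser` classes and the host names are invented for illustration and do not describe how any real browser or tracker works.

```python
import uuid


class Tracker:
    """Hypothetical third-party server embedded as an iframe on many sites."""

    def __init__(self):
        self.visits = {}  # cookie id -> top-level sites where it was seen

    def handle_iframe_request(self, cookie, top_level_site):
        # Issue an identifier on first contact, then recognise it afterwards.
        cookie = cookie or uuid.uuid4().hex
        self.visits.setdefault(cookie, []).append(top_level_site)
        return cookie


class Browser:
    """Toy browser whose cookie jar is keyed by cookie host only."""

    def __init__(self):
        self.jar = {}

    def visit(self, top_level_site, tracker, tracker_host="tracker.example"):
        # The same third-party cookie is sent no matter which site embeds
        # the iframe, which is what lets the tracker join the visits.
        sent = self.jar.get(tracker_host)
        self.jar[tracker_host] = tracker.handle_iframe_request(
            sent, top_level_site)
```

After visiting two unrelated sites that embed the same iframe, the tracker holds a single identifier associated with both sites.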

Deprecating third-party cookies isn't the end of this process, it's a
starting point. The https://privacysandbox.com site has more information
on the various initiatives.

We're in the process of creating use cases for things which will stop
working after third-party cookie deprecation. You can follow along, and
contribute more cases, as part of the FedID CG in the
https://github.com/fedidcg/use-case-library/issues tracker.
These cases let us understand how 3rd party cookies are being used now
and what the right path forward for those use cases is post deprecation.


> Second, many of the examples used, of relationships between RPs and IdPs, assume social logins that are third-party situations (apps that sign in w/ google, FB, apple, etc). There are so many more businesses out there that set up their own IdP that happens to be cross-site mechanically, but are in reality first-party in all other respects, and they have their own business rationale for this. So it seems that the mindset of "oh everyone just uses one of the big social IdPs to login" is distorted from the reality of the relationships between organizations and their users. Think of a hospital that has their own IdP to handle patients and the apps they need to log in to to process their care. If this kind of thing is broken, then in your browser people won't be able to log in to get their chemo scheduling and follow-up doctors' visits (and I can tell you that apps in hospitals take years to get updated).

There is no assumption in FedCM that you use an IDP from a set list.
The website provides the IDP URL to FedCM. So, if you set a provider
url of your own IDP that supports FedCM it should work correctly. If
that IDP is in a first-party-set with the websites which use the
provider, then the CHIPS proposal (https://github.com/WICG/CHIPS)
should also help make this easier as a Partitioned cookie could be
used.
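
For reference, a partitioned cookie under CHIPS is an ordinary `Set-Cookie` header with the `Partitioned` attribute added (the proposal requires `Secure` alongside it). A minimal Python sketch follows; the helper name, cookie name, and value are hypothetical, and the `__Host-` prefix plus the other attributes are assumptions based on the examples in the proposal rather than anything FedCM mandates.

```python
def partitioned_set_cookie(name: str, value: str) -> str:
    """Build a Set-Cookie header value that opts in to partitioned storage."""
    attributes = [
        f"{name}={value}",
        "Secure",         # required for Partitioned cookies
        "Path=/",
        "SameSite=None",  # the cookie is used in a cross-site context
        "Partitioned",    # keyed by the top-level site, per CHIPS
    ]
    return "; ".join(attributes)


# Hypothetical IdP session cookie.
header = partitioned_set_cookie("__Host-idp_session", "abc123")
```

A cookie set this way while embedded under one top-level site is not sent when the same IdP is embedded under a different top-level site.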


> Maybe you're aware of this already, but all the presentations I've ever seen don't seem to illustrate this level of understanding for non-social/non-enterprise login situations. I think the assumption there are only O(100) IdPs is very wrong/underestimated. Personally I have hundreds of customers/companies that run their own IdPs. I know the larger companies (Okta, Auth0, Gluu, ForgeRock, Ping, etc) have more customers than I do. That's at least many thousands of IdPs and the economic impact of this will be non-trivial.
>

When you say there are hundreds of customers with their own IDPs, does
this mean the customer has an IDP installed on-premises that the
customer upgrades and runs? Or is it a SaaS solution where they
configure a solution run by your company? Or is it a mix of both?


> Third, I think we've all learned over the years in the protocol space that having the browser involved as little as possible in communicating between the RP and IdP is a good thing. These proposals asking for the browser to take over all of this is... well... hard to grasp. It's a drastic shift from the momentum in the identity protocol space (e.g. PAR). The proposals I have seen in the video are trying to mediate protocol flows that are too simplistic (implicit flow w/ form posting id_tokens as the most common thing). Most scenarios I work thru don't pass id_tokens thru the browser, so these solutions seem presumptuous on protocol flows. Also, how do access tokens (for the underlying business APIs) fit into all of this, since that's the other half of what these protocols deliver to RPs?

We started with the implicit flow returning id_tokens as that was the
initial starting point we knew was broken (something like a single
page application needing access without top-level redirection). We're
investigating how this looks when you return access_tokens and if we
need to also do refresh_tokens and what that looks like.

As mentioned above, we'd appreciate any feedback on use-cases which
have not been specified. If there are setups that will be broken we'd like to
know about them to understand the impact and course correct as
needed.


> Fourth, how do I get to write custom login UI during the login process?

The option to customize the login UI is something that has come up
previously but we have not had time to investigate at this point. The
current thinking is something like Fenced Frames
(https://github.com/shivanigithub/fenced-frame) would be beneficial
here but we don't know yet. See
https://github.com/WICG/FedCM/blob/main/explainer/cookies.md#sign-in-1
for a bit more information.

>  How does new user registration work in this world?

Do you mean new user registration on the IDP side or on the website?
Users would sign up to their IDP the same way they do now, I don't
think anything would change. The first time the user visits a website
which requires a credential they would be presented with an account
chooser for the IDP (if they have more than one account) and a consent
dialog (currently presenting links to the Privacy Policy and Terms of
Service). The acceptance of the consent dialog would then allow the RP
and IDP to know about each other.

The account chooser is populated by the browser making a request to the
IDP and providing the IDP cookie (but not referring to the RP). The IDP
can then provide a list of accounts (or an empty list if there are no accounts)
for the user.
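
A minimal sketch of what the IDP side of that request could look like, in Python. The session store, the cookie name, and the JSON shape are all hypothetical; the spec is not finished, so treat this as an illustration of the idea rather than the actual wire format.

```python
import json

# Hypothetical IDP session store: session cookie value -> user's accounts.
SESSIONS = {
    "session-123": [
        {"id": "1", "name": "Ada Example", "email": "ada@idp.example"},
    ],
}


def list_accounts(cookies: dict) -> str:
    """Return the accounts for the current IDP session as JSON.

    The request carries the IDP cookie but nothing that identifies the RP,
    so this handler deliberately takes no RP parameter.
    """
    accounts = SESSIONS.get(cookies.get("idp_session"), [])
    # An unknown or missing session yields an empty account list.
    return json.dumps({"accounts": accounts})
```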


> What if I need to dynamically prompt for a MFA (or not) depending on the state of things? I might need to prompt the user for their email to federate to an additional external IdP (thus an additional hop) based on what the user inputs. I might need to accept custom parameters from the RP that affect the login workflow. I might need to introduce a 15-page workflow for a new EULA during login because of a change in the corporate policy. These things are very common and the exact reason companies set up their own IdP for their suite of apps and APIs. If the browser forces itself in the middle, that's a real deal breaker here, it seems.

All of these cases sound like the user is logged out of the IDP (or,
essentially logged out of the IDP as they don't get access to anything
until they've done additional steps). From the FedCM perspective, I'd
expect the IDP to return an empty account list to FedCM. FedCM would
terminate, and tell the RP there is no credential. The user would then
be (in theory) on the sign-in page for the website and can follow the
flow they would have performed previously (which I'm guessing is a top
level navigation?)
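
The fallback described above could be sketched like this in Python. The function names and return values are hypothetical stand-ins, not FedCM APIs; the point is only that an empty account list yields no credential and the RP continues with its existing flow.

```python
from typing import Optional


def mediate(accounts: list) -> Optional[dict]:
    # Stand-in for the browser-mediated step: no accounts, no credential.
    return accounts[0] if accounts else None


def rp_sign_in(accounts: list, sign_in_url: str) -> tuple:
    credential = mediate(accounts)
    if credential is None:
        # Fall back to the flow used today, for example a top-level
        # navigation to the IDP where MFA or a EULA step can be shown.
        return ("navigate", sign_in_url)
    return ("credential", credential)
```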


> Why should the browser see user identifiers at all (and it doesn't quite help with the goal of reducing tracking)? So the assumption that solving login redirects w/ id_tokens and signout iframes is the main feature is wrong. I'm not sure this is the browser's job.
>
> Finally, in my own personal browsing I use Firefox. I configure it to remove all cookies when I close the browser. I then explicitly configure the sites where I want cookies to be persistent. This seems like a nice way to achieve the purported goal without breaking the world. Why not make something like this the default? It's a hell of a lot simpler. I'd love to hear why this wouldn't work,

That solution does work, but it has other trade offs. Clearing the cookies
on browser close means a user session never survives a browser restart
which is not the expected behaviour many users have at this point. Users
expect to be logged in the next time they open their browser. The explicit
choice to be logged out after restart is a valid one to make, but I don't
think it's the norm, or the expected default.

This also ties your privacy to how long you keep your browser running. For
folks who only restart when the browser updates, that could be a long period
of time in which they receive no protections.

> and I was hoping back when we heard from the Firefox team that something like this wasn't a possible solution back when we had the 2-day W3C identity event. Again, assume I'm thick, so I'm sure I'm missing something here.
>

Can you expand on "heard from Firefox that what wasn't a possible solution"? (I joined the
team after the W3C identity event happened so was not present for the
discussion). I'd appreciate any context you can provide.


> I'm sure as I actually spend more time getting into the technical details, I'll come across more issues/scenarios/questions. I'm happy to be wrong on all of my immediate reactions -- please share any links that corrects me on these concerns.
>

Please share your issues (if they are with FedCM, file them at https://github.com/wicg/fedcm), scenarios
(https://github.com/fedidcg/use-case-library/issues), and questions.

If you'd like more background context, there is a bunch of exploration
documentation in the FedCM repo which you can see at
https://github.com/WICG/FedCM/tree/main/explainer (apologies as I may
break that link in the near future to rename it to
https://github.com/WICG/FedCM/tree/main/explorations to better explain
what it is).

The FedCM spec (https://wicg.github.io/FedCM/) itself also has a bit
of background, but not as much as the explainer does.

The overarching project can be seen at https://privacysandbox.com with
a timeline (which also conveniently lists many of the projects
involved) at https://privacysandbox.com/timeline.

Thanks again for the feedback.
dan
Received on Tuesday, 30 November 2021 21:35:32 UTC