Re: Required Domain proposal for Additional Certificates from Ryan Sleevi on 2019-04-02 (ietf-http-wg@w3.org from April to June 2019)

From: Ryan Sleevi <ryan-ietf@sleevi.com>
Date: Tue, 2 Apr 2019 14:46:32 +0900
To: Nick Sullivan <nick@cloudflare.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>, Ryan Sleevi <ryan-ietf@sleevi.com>
Message-ID: <CAErg=HGY8v4=gUbOaOwZ-AX+=g6MNu1kQCpqwKx-tboiDoepHA@mail.gmail.com>
On Tue, Apr 2, 2019 at 1:24 PM Nick Sullivan <nick@cloudflare.com> wrote:

> I'm not seeing any disagreement about the assertion that this proposal
> solves the issue of key compromise. Let me know if you disagree.
>

I tried to focus on the more significant concern, as it highlights
fundamental issues. I did not want to focus on those issues while the
structural issues were still outstanding, but no, I don’t think the
proposal addresses key compromise as presently proposed.

>
> I don't think it's necessary to spell out the exact requirements for "via
> CT" in this RFC. I would be fine with a bit of text stating that the UA
> should have some expectation that domain owners have the ability to detect
> misissuance for certificates used as secondary. The decision and manner to
> implement would be left to the UA based on their view of the ecosystem and
> risks.
>
> The breadcrumb aspect of how secondary certificates need to include the
> attacker's domain is a bonus.
>

Thanks for clarifying. I think this is an area of disagreement we should
figure out how to resolve. I tried to offer a suggestion, by making sure
it’s clear what properties are necessary to achieve the claim that this is
not a fundamental security risk to potential implementors. I don’t think
the suggestion of “use CT” actually achieves this, because one can’t
examine that suggestion and rationalize whether or not a given UA’s
implementation of CT achieves those properties. Furthermore, because the
proposal gives implementors no specific guidance on how to achieve those
properties, and no assurance that they will be met, site owners cannot
evaluate whether or not it is safe to enable this feature.

This strikes me as somewhat similar to the use of CBC in TLS, and leaving it
as an exercise to the reader as to how to achieve a constant time
implementation. We could opt to make it clear that an implementation MUST
be constant time, we could opt to provide a reference implementation that
achieves constant time, or we could avoid using it entirely (or
deprecate it from relevant specifications), on the basis that it is a huge
security risk that is best accommodated by different constructions
entirely, rather than trying to safely aim the footgun. I fear the current
proposal doesn’t provide the necessary guidance as to what to expect, for
sites or implementors, and that makes this unacceptably dangerous.


> A different way to frame this would be in terms of the attacker's
> cost/benefit of:
> - misissuing a current DV certificate and using it maliciously against
> any/all clients
> - misissuing a Required Domain certificate from a CA that enforces
> stricter validation checks and using the certificate maliciously against
> clients who have implemented secondary certificates and the recommended
> guidelines we propose in this document
>

I don’t find that to be a helpful framing, and this may be another source
of disagreement. I fear that this framing encourages us to make insecure
compromises that ossify weaknesses, on the basis that there is something
weaker, rather than address gaps and build stronger systems.

For example, CAs are taking steps to improve and mitigate against the DNS
attacks mentioned, and we can reasonably expect the industry to continue to
do so, and to continue to mitigate DNS risks. We should not assume this
protocol is safe or secure merely because that’s not universally
distributed yet, and should instead ensure the system is robust even when
these DNS attacks are mitigated. I don’t believe the current proposal gets
there.


> This scenario seems functionally equivalent to the compromised certificate
> scenario but, instead of a malicious attacker, the party that has access to
> the compromised certificate key is a friendly party you already have a
> business relationship with.
>

I think we should be careful about trying to group attacks at this early a
stage. I don’t agree that they are the same. The existence of a past
business relationship doesn’t imply a present business relationship.
Likewise, this scenario is where the Related Domain is still in the
possession of the “attacker”, and the certificate was legitimately
authorized. The technical and policy mitigations are rather significantly
different in this regard, and the proposed solution hardly defends against
it.

>> Do you have data to support this? Much like HTTP/2 PUSH, I think we can
>> imagine 'idealized' models of how this SHOULD improve performance, but the
>> practical reality is often far from it, and hardly realized.
>>
>
> We haven't measured the user-visible gains yet, but we hope to collaborate
> with a browser vendor to quantify them. I'm optimistic that the performance
> gains will be significant given the ubiquity of services like cdnjs and
> jsDelivr.
>

It seems like this can be quantified well in advance of needing any UA to
implement. Whether this is modifying existing open-source browsers to
synthetically test, or by gathering the data and quantifying the potential
savings, it seems rather possible to objectively evaluate this based on the
current ecosystem, and to look at a host of metrics that this
hypothetically could affect.


> I don't want to keep harping on this, but DNS poisoning does not require
> a noisy attack like a BGP hijack
> <https://dl.acm.org/citation.cfm?id=3278516>. Also, the proposals made
> here are not astronomically costly to implement, they're modest steps in
> the direction the PKI is already going. I see the recommendations we make
> in this document as a forcing function in the direction of more secure web
> PKI practices.
>

We disagree here as well. The presumptive value of the Required Domain is
based on an assumption of certain properties being met, which are not
enumerated in the proposal, but which it is believed that “use CT” meets.
CT has taken 7 years to reach ubiquitous deployment in a single UA, and
many of the present properties provided to that UA are a result of
non-generalizable policy decisions which other UAs would not want to
implement, given the harm it would cause the ecosystem. That seems like a
rather astronomical cost, especially given the ongoing costs to maintain a
healthy CT deployment and implementation.

If you think there are modest proposals that approximate the value of CT,
without those costs, they would be great to elaborate on in the proposal.
However, I don’t think we should say “This is fine, because CT”, while not
acknowledging the significant cost it has to implement at this time.

>> Isn't this largely a consequence of the increased centralization of those
>> servers? That is, it seems the cost is borne by client implementations -
>> which I think extends far beyond merely contemplating UAs, unless this is
>> truly meant to be something "only browsers do, because if anyone else does
>> it, it's not safe/useful" - in order to accommodate greater centralization
>> by the servers. I ask, because given the costs, a more fruitful technical
>> discussion may be exploring how to reduce or eliminate that centralization,
>> as an alternative way of reducing that connection overhead.
>>
>
> What do you mean by centralization? In the case that I laid out,
> JavaScript subresources, if a web server can serve code from a subresource
> on an established connection without checking DNS, that means multiple
> server providers can serve the content. This allows content hosts to
> diversify the set of TLS terminating proxies they can use simultaneously
> without using hacks like the multi-CDN CNAME chain configuration that is
> causing issues for the specification of ESNI right now.
>

A server can also serve code on the same origin, far more efficiently, and
with significantly better caching properties in today’s modern browsers, if
our focus is merely interactive user agents, and without these rather
significant security considerations.

To your question though, the certificates only provide a benefit if the
same TLS terminating entity is responsible for both origins - and that
seems to centralize more. If a given service can only be “competitive” with
performance if they collocate on the same CDN as their dependent resources
(such as ad networks or other truly 3P content), then it creates
significant incentives to further internet centralization, right?

> If there are hidden assumptions, we should reveal them and add them as text
> in the security considerations.
>

I think right now there’s a huge assumption in the detection property,
presumably being met via CT (with a host of assumptions of site operators
jeopardized by the mere existence of this spec), and in the responsiveness,
presumably being handled via revocation. We should be spelling out what a
safe or successful deployment looks like - or more carefully calling out
the risks, both those believed addressed and those unaddressed.


>
>> I think it'd be much more fruitful to focus on what the properties are,
>> rather than attempting to iterate on technical solutions with hidden
>> assumptions or dependencies. I suppose we'd treat that as a "problem
>> statement" in IETF land, much like we might pose it as an "Explainer"
>> within groups like WHATWG, which try to set out and articulate the problems
>> and need for a solution, and then iterate on the many assumptions that may
>> be hidden beneath those statements.
>>
>
I still believe this would be a more useful exercise than comparing to the
PKI and HTTPS system at this point in time today.

> I don't see how the implementation of ORIGIN by command line tools is
> relevant. Browsers are more likely to benefit from connection coalescing in
> ways that impact user experience than command-line tools used for bulk
> downloads in which shaving off RTTs is less important.
>

It’s relevant because the overall proposal introduces significant risks
whose mitigations don’t have easily deployed options for these clients.
If, say, curl were to implement this feature, but without proposed
mitigations (such as an ambiguous deployment of CT or some out of band
revocation protocol), would that be safe or responsible? For site operators
who are contemplating deployment of this, would a deployment of this by UAs
that don’t feel the mitigations are necessary - as seems to be the present
suggestion, much like ORIGIN left DNS optional - lead to security risk for
site operators who service those users?

This is an area where the comparison to CBC is again apt - that a number of
browser UAs implemented constant time mitigations for CBC still leaves
servers at risk when enabling CBC ciphersuites, due to not being able to be
assured that clients negotiating CBC are actually doing so securely. This
is an ecosystem consideration, but one that seems to directly impact the
deployability of this proposal, which is why it bears consideration. The
purpose of the spec providing guidance here is not “just” for implementors
deploying servers or clients capable of negotiating this - but for the site
operators tasked with obtaining such certificates and wanting to understand
what, if any, properties the spec guarantees if they do so.

>
Received on Tuesday, 2 April 2019 05:47:09 UTC