Re: Exposing TLS & Certificate Information in Javascript from Seetharama Rao Durbha on 2013-05-29 (public-webcrypto-comments@w3.org from May 2013)

From: Seetharama Rao Durbha <S.Durbha@cablelabs.com>
Date: Wed, 29 May 2013 20:50:22 +0000
To: Ryan Sleevi <sleevi@google.com>
CC: Richard Barnes <rbarnes@bbn.com>, Douglas Stebila <stebila@qut.edu.au>, "public-webcrypto-comments@w3.org" <public-webcrypto-comments@w3.org>
Message-ID: <CDCBC305.E964%s.durbha@cablelabs.com>

Ryan
Thank you very much for the list of example issues. Very insightful. However, I still have questions. Please see inline.

--Seetharama

On 5/29/13 1:53 PM, "Ryan Sleevi" <sleevi@google.com> wrote:

This is far from an exhaustive list, but is provided as proof.

Note that this point has actually been studied quite a bit with
different browser vendors. There's a reason that proposals such as
Channel ID, which Harry mentioned, are much preferable, in that
they provide a layer of persistence that transcends the direct
transport layer.

1) Redirects
- eg: "GET /" -> "Location: http://example/bar"
Is the SSL/TLS certificate that of the original GET request or of
the Location?
Is the SSL/TLS session (eg: for key material export, as raised by
Tom Ritter) that of the original GET or of the new resource?
- Surprise: UAs disagree.
Not sure why UAs disagree – to me it appears that the certificate from the second GET is what matters, because that is where I am getting my actual content from.

2) Renegotiations
- eg: "GET /" -> renegotiation -> headers -> renegotiation -> body
The SSL/TLS certificate may have changed at any point in that flow
(yes, it really *does* happen)
For servers that do things like request client certificates after
the headers, user agents may have persisted the headers from the
original request, but then tear down the session and establish a new
one AFTER prompting the user for a client certificate. Both
certificates apply to the security domain.
- Surprise: UAs disagree.
Again, not sure why UAs disagree. Renegotiation is fine, but the body gets downloaded after the SECOND renegotiation – which means the latest server certificate, the one used to download the body, is what I am interested in.

3) Cache Validation
- eg: "GET /" -> "304 Not Modified"
Is the SSL/TLS certificate that of the original GET request, or that
of the cached response?
- Surprise: UAs disagree.
Not sure I see a distinction – the 304 comes in response to a GET request sent over TLS, and so the 304 response is as reliable as any content.

4) If you're doing things right, you're *NOT* delivering script inline
(cf. CSP), but instead loading it via a script src directive.
Is the SSL/TLS certificate that which delivered the script? Or that
which loaded the "main" content?
Is the SSL/TLS session (eg: for key material export) that of the
script or of the main content?
As I mentioned in my earlier email, I think we are talking about the TLS connection used to download the main content – not the one used for the script. Maybe we should clarify the requirement accordingly.

5) Partial content
- eg: "GET /" -> "206 Partial Content" -> "GET /" -> "206 Partial Content"
Is the SSL/TLS certificate that of the first half of the partial
content? Or the second half? Or the Nth half?
Can we treat this as an exception case and call out that certificate information will not be available if the certificate differs across the GET requests?
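
The exception rule suggested in the reply above can be illustrated with a minimal sketch. Nothing here comes from an actual browser API: the `Exchange` shape, the `certFingerprint` field, and the `certificateForLoad` function are hypothetical names, assuming only that a user agent could record which server certificate was seen on each request that contributed to one resource.

```typescript
// Hypothetical shape: one entry per HTTP exchange that contributed to a
// single logical resource load (e.g. each 206 Partial Content response).
interface Exchange {
  url: string;
  // Hex fingerprint of the server certificate seen on this exchange.
  certFingerprint: string;
}

// Returns the shared fingerprint when every exchange saw the same server
// certificate, and null when they differ or when there were no exchanges.
function certificateForLoad(exchanges: Exchange[]): string | null {
  if (exchanges.length === 0) {
    return null;
  }
  const first = exchanges[0].certFingerprint;
  const allSame = exchanges.every((e) => e.certFingerprint === first);
  return allSame ? first : null;
}

// Two partial responses served under the same certificate.
const consistent: Exchange[] = [
  { url: "https://example/video", certFingerprint: "aa11" },
  { url: "https://example/video", certFingerprint: "aa11" },
];

// Two partial responses served under different certificates.
const mixed: Exchange[] = [
  { url: "https://example/video", certFingerprint: "aa11" },
  { url: "https://example/video", certFingerprint: "bb22" },
];

console.log(certificateForLoad(consistent)); // prints aa11
console.log(certificateForLoad(mixed)); // prints null
```

Note that this sketch only covers the partial-content case; it says nothing about which certificate applies when redirects or renegotiations are involved.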

I'm a strong advocate of exposing more information, but I think it's a
fundamentally flawed premise to couple the application layer (eg: JS
APIs) to the fundamental details of the transport layer, at least for
the "open web" applications.

This could certainly be something of interest for SysApps style use
cases - in which the API doing the inspection/extraction is
*independent* from the resource being loaded. Both Chromium and
Firefox have experimented (to some success, arguably) with such APIs:
eg: Chromium's webRequest.

The closest comparable APIs are the ResourceTiming and NavigationTiming
APIs - but note that as part of their design, they actually expose a very
high-level/abstracted view of the details - and avoid notions such as
"the" connection.

Cheers,
Ryan

On Wed, May 29, 2013 at 12:36 PM, Seetharama Rao Durbha
<S.Durbha@cablelabs.com> wrote:
How is the plurality-of-connections argument applicable when we are talking
about the TLS connection used to get 'the HTML'? We are not talking about
anything done to process the HTML (like downloading a script/CSS/image/etc).
Isn't the lock displayed for the connection used to get the HTML itself and
nothing else?

--Seetharama

On 5/29/13 12:34 PM, "Ryan Sleevi" <sleevi@google.com> wrote:

Richard,

As I explained previously, there is not inherently a concept of a
single TLS session for the associated load of the 'main page' (from
whence origin is derived).

Further, while it is tempting to reduce it to such primitives, it quickly
becomes evident that this fails to provide any value-added security
benefit, because it fails to express the code executing in the overall
environment.

While the Same Origin Policy is wholly sufficient for notions such as
Origins, and can be extended through the use of CORS, the application
of dynamic content over a plurality of TLS connections, each of which
may have multiple negotiations established on it, makes notions
such as "THE" certificate or "THE" keying material fundamentally
flawed.

Cheers,
Ryan

On Wed, May 29, 2013 at 11:32 AM, Richard Barnes <rbarnes@bbn.com> wrote:

Ryan,

I'm a little confused here. Origins are also fundamental for web security
and used programmatically. There's a well defined origin based on the base
document URI, even though a given page load can come from multiple origins.
If I import a script from another origin, it still executes within the
overall origin for the page.

Could we not do something analogous here? Just as the origin for the page
is derived from the URI for the base page, couldn't we just define that the
TLS information provided is for the connection that loaded the base page?
After all, that information is the root of trust for the page, since all the
other resources are loaded based on information retrieved over that
connection.

--Richard

On May 29, 2013, at 12:28 PM, Ryan Sleevi <sleevi@google.com> wrote:

On Tue, May 28, 2013 at 10:17 PM, Douglas Stebila <stebila@qut.edu.au> wrote:

We have been doing some research on building application-level cryptography
on TLS connections. In one of our recent projects, we wanted to
cryptographically bind from the application layer to the TLS connection.
There are several ways of theoretically doing so, but the seemingly simplest
would be to get (the hash of) the server's X.509 certificate from the TLS
connection (this is one of the mechanisms specified in RFC 5929, TLS channel
binding). In our application, only the Firefox extension API allowed us to
access that information (via XPCOM). It would be nice to have a
standardized way of doing this, and it seems like this may fall under the
category of secondary features in the Web Cryptography API charter.

Ryan Sleevi kindly pointed out a discussion thread on this mailing list from
February 2013 that discusses things related to this issue, where an API
exposing a variety of information on the TLS connection was proposed. The
subsequent discussion in the thread pointed out the subtleties of what is
"the" TLS connection
(http://lists.w3.org/Archives/Public/public-webcrypto-comments/2013Feb/0006.html;
portions reproduced below). I apologize for being late to the discussion
and reopening a dormant conversation, but it did not seem to come to a
resolution, beyond that there are subtleties.

While it is true that there are subtleties, getting at least some
information about the TLS connection would be a very useful thing to have
available, and it may be possible to identify a canonical set of TLS
parameters. In fact, browsers effectively do so: when you click on the lock
icon, you get a single certificate and a single explanation for the
properties of the TLS connection. I'm not sure which canonicalization
browsers use, but two potentially reasonable choices include "the first
certificate used on the main document", or "the most recent certificate used
on the main document". Yes, a single snapshot doesn't capture the whole
history of the security context, but it does capture the security parameters
at that canonical point in time, and that's enough to enable some
interesting applications.

In summary: can we have an API that gives the same information about the TLS
connection as what would be obtained by clicking on the lock icon in the web
browser?

Short answer: No

And for the reasons I detailed on that thread.

That lock is actually quite misleading for expressing overall security
policy, but it is enough of a heuristic to be acceptable for the
security goals it tries to achieve for end users. A programmatic API cannot be
heuristic-based like that, particularly to meet your use cases.

Cheers,
Ryan

Douglas

window.location.tls = {
  version: 'SSL 3.0' || 'TLS 1.0' || 'TLS 1.1' || 'TLS 1.2' || '',

  // I'm really bad at naming things
  flavor: 'PKIX' || 'SRP' || 'PSK' || 'OpenPGP',

  ciphersuite: {
    // From https://www.iana.org/assignments/tls-parameters/tls-parameters.xml
    value: Uint8Array, // From the Value column
    description: string // From the Description column
    // Potentially fill out sub values like "Cipher", "Key Exchange"...
  },

  // .certificates is an array of Certificate Objects, or an
  // empty array if no certificate is used (HTTP, TLS-PSK, DH-Anon, etc)
  // [0] is the root, and it goes in ascending order to the leaf,
  // based on the path constructed by the browser
  certificates: [
    CERTIFICATEOBJECT,
    ...
  ]
};

The failure of this entire proposal is that it disregards the
multi-connection, multi-origin model involved in any origin load.

Yes, except for the main page.

No. Even the main page may have had multiple TLS identities involved.

1) It may have originally requested example.com, but may have been
redirected (302 to 307) to subdomain.example.com. However, the identity of
both is relevant in terms of origin security, since the initial example.com
may have been hijacked by an attacker to leverage items such as session
pinning or cookie hijacking.
2) If the server is not patched for TLS renegotiation (a disproportionate
number of servers unfortunately remain unpatched), then a hostile MITM may
interject themselves before initiating a renegotiation. For example, your
site https://www.ianonym.com is vulnerable to this well-known attack (only
noticeable after ignoring the certificate mismatch).
3) Even absent hostile intent, a server may be configured to renegotiate the
security parameters of the server in such a way that fundamentally alters
the connection. This is quite common.
4) In the face of invalid certificates or requests for client certificates,
which require user interaction, many user agents will break the TCP
connection after a certain amount of time has elapsed, since the active
connection is contingent upon user interaction. Thus a logical load may have
employed multiple connections.
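
To make the four points above concrete, here is a minimal sketch of a single main-page load recorded as a sequence of TLS handshakes. The `Handshake` type, the `Cause` labels, the fingerprint values, and the helper functions are all hypothetical and are not part of any proposal in this thread. The sketch only shows that the two canonicalizations floated earlier ("the first certificate" and "the most recent certificate") can pick different certificates for the same load.

```typescript
// Hypothetical record of one TLS handshake observed during a single
// logical load of the main document.
type Cause = "initial" | "redirect" | "renegotiation" | "reconnect";

interface Handshake {
  host: string;
  cause: Cause;
  certFingerprint: string;
}

// Canonicalization A: the first certificate used on the main document.
function firstCertificate(load: Handshake[]): string | undefined {
  return load.length > 0 ? load[0].certFingerprint : undefined;
}

// Canonicalization B: the most recent certificate used on the main document.
function mostRecentCertificate(load: Handshake[]): string | undefined {
  return load.length > 0 ? load[load.length - 1].certFingerprint : undefined;
}

// Every distinct certificate that took part in the load, in the order seen.
function allCertificates(load: Handshake[]): string[] {
  return load
    .map((h) => h.certFingerprint)
    .filter((fp, i, all) => all.indexOf(fp) === i);
}

// Illustrative data: a redirect, then a renegotiation, then a reconnect
// after a user prompt, mirroring points 1, 3 and 4 above.
const mainPageLoad: Handshake[] = [
  { host: "example.com", cause: "initial", certFingerprint: "cert-a" },
  { host: "subdomain.example.com", cause: "redirect", certFingerprint: "cert-b" },
  { host: "subdomain.example.com", cause: "renegotiation", certFingerprint: "cert-c" },
  { host: "subdomain.example.com", cause: "reconnect", certFingerprint: "cert-c" },
];

console.log(firstCertificate(mainPageLoad)); // prints cert-a
console.log(mostRecentCertificate(mainPageLoad)); // prints cert-c
console.log(allCertificates(mainPageLoad).join(", ")); // prints cert-a, cert-b, cert-c
```

Under these assumptions, any API that returns a single certificate has to discard at least two of the three identities that were involved in the load.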

Received on Wednesday, 29 May 2013 20:51:06 UTC