Thoughts on HTTPS Links rewriting from Francois Daoust on 2008-11-14 (public-bpwg-ct@w3.org from November 2008)

From: Francois Daoust <fd@w3.org>
Date: Fri, 14 Nov 2008 18:37:29 +0100
To: public-bpwg-ct <public-bpwg-ct@w3.org>
Message-ID: <491DB759.80701@w3.org>

A few thoughts on section 4.2.8.2 HTTPS Links rewriting:
http://www.w3.org/2005/MWI/BPWG/Group/TaskForces/CT/editors-drafts/Guidelines/081107#sec-https-link-rewriting

I don't think we'll receive many more comments from external people at
this point, and I think we should just move on with what we already have.

Link with the IETF TLS Task Force
-----
I sent an email to the IETF TLS task force:
http://www.ietf.org/mail-archive/web/tls/current/msg02968.html

The only reply confirmed that "the server cannot detect the attack
(unless of course he requests client certificates) via the TLS protocol,
the client could. The server could detect the attack by noticing few IPs
that make many different transactions".

I also exchanged a few private messages with persons subscribed to that
mailing-list. The reason I mention this is that one person mentioned
that the "Via" header that we say the CT-proxy MUST add in this case may
not be a good idea for three reasons:
1. we are trying to use the "Via" to advertise a security issue, and
it's always better to be explicit.
2. the proxy is not truly acting as a proxy anymore. The request
originates from the proxy in that case, so it's not exactly a "Via".
3. the server could be composed of a proxy that receives the request,
decrypts it and passes it to another server. In that case, the
server-side proxy may add a "Via" header. The problem is not that it may
override the one set by the CT-proxy (it should properly complete the
header if it's already there), but that there is a case where the final
server receives a request with a "Via" HTTP header, even when the
request is made in HTTPS. From an external point of view, it's still in
the server black box.

I agree, but do not see any other real possibility at this point, apart
from creating a new explicit HTTP header field "Man-in-the-middle", but
I don't think that's such a good idea.

Last last call comment
-----
Thomas sent an additional comment to warn about cases where rewriting
HTTPS links breaks:
http://lists.w3.org/Archives/Public/public-bpwg-comments/2008OctDec/0002.html

I just sent a reply to get a more precise idea of what this comment
refers to. AFAICT, it is focused on Web Applications, and I'm not sure
that it adds new cases to the list of breaking cases we are already
aware of.

Change of domain use case
-----
Eduardo raised an important use case I don't think we've considered before:
http://lists.w3.org/Archives/Public/public-bpwg-ct/2008Nov/0030.html

When the page that contains the HTTPS link is hosted by some other
server, then there is no way for the targeted server to prevent the
re-writing of the HTTPS links.

Two solutions:
1. Forbid rewriting of HTTPS links that target another hostname. Since
SSL certificates are per hostname (is that right?), I think that's a
consistent guideline. The main visible consequence of such a guideline
would be that a search engine that proposes HTTPS links in a search
result cannot offer transcoded views of such links. The guidelines could
say:
"Proxies MUST NOT rewrite HTTPS links when the hostname targeted by
the link does not match the one of the resource being retrieved"

2. Emphasize that servers should detect the presence of a Via HTTP
header in an HTTPS request, and reply with a 406 if they don't want
their response to be visible by a CT-proxy. The problem is that I don't
see how a CT-proxy could "recover" in an automated way in such a case.
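
As a rough illustration of solution 1, the hostname comparison could
look like the Python sketch below. The function name and the example
URLs are hypothetical, and the sketch assumes that "hostname" means the
host part of the URL only (port and path are ignored); the guideline
text would need to say so explicitly.

```python
from urllib.parse import urljoin, urlsplit


def may_rewrite_https_link(page_url: str, link_url: str) -> bool:
    """Apply the proposed guideline: an HTTPS link may only be rewritten
    when it targets the same hostname as the page that contains it."""
    # Resolve relative links against the page that contains them.
    link = urlsplit(urljoin(page_url, link_url))
    if link.scheme != "https":
        # The proposed guideline only constrains HTTPS links.
        return True
    page_host = urlsplit(page_url).hostname
    # urlsplit() lowercases hostnames, so the comparison is case-insensitive.
    return page_host is not None and link.hostname == page_host
```

With this check, a search engine result page on one host that links to
an HTTPS page on another host would be left untouched, which is the
visible consequence described in 1.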

It depends on what we're trying to say here: if it's "the user must have
the choice" then 1. goes too far in that it prevents users from making
the choice. If it's "content providers must have their say too" then 1.
is a good solution.

I would personally prefer 1.
Any other thoughts and/or solutions?

The text in section 4.2.8.2
-----
Pending the potential addition of the above mentioned guideline, I think
the thrust of the section is good.

We may want to add a list of cases where rewriting HTTPS links will just
break: when the server uses client certificates, when the server refuses
the request because it comes from another address, and the remaining
cases I can think of, which are triggered by the change of origin and
may not be specific to HTTPS. It could take the form of an additional
note. I'm not so sure this is useful (as with encoding, in the end, the
message would probably read "avoid bugs").

Another minor point. I think we've already discussed this but I can't
recall when. The guideline "If a proxy re-writes HTTPS links,
replacement links must have the scheme https" is slightly too
restrictive in that the replacement link is likely to be for a page that
warns the user about the security implications of doing that, and that
page doesn't need to be in HTTPS, only the final links need to be.
Anyway, I like the concision and clarity of this guideline, just thought
it was worth noting somewhere.

The text for content providers
-----
I see only two things we may say:
- "Send a Cache-Control: no-transform if you do not want HTTPS links to
be rewritten"
- "Detect the presence of the Via HTTP header in HTTPS requests and
return with a 406 if you do not want to send sensitive data to the proxy".
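
To make these two statements concrete, here is a minimal server-side
sketch in Python. It is framework-neutral, and the function names and
the plain-dict representation of headers are assumptions made for the
example. Note that the Via check inherits the weakness raised above: a
server-side proxy that decrypts the request may add its own Via header,
so a real deployment would have to tell the two cases apart.

```python
from typing import Dict, Mapping


def status_for_request(request_headers: Mapping[str, str], is_https: bool) -> int:
    """Return 406 when an HTTPS request carries a Via header, else 200."""
    # HTTP header names are case-insensitive.
    names = {name.lower() for name in request_headers}
    if is_https and "via" in names:
        return 406
    return 200


def with_no_transform(response_headers: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the headers with Cache-Control: no-transform set.

    Assumes the header, if present, is stored under the exact key
    "Cache-Control".
    """
    out = dict(response_headers)
    directives = [d.strip() for d in out.get("Cache-Control", "").split(",") if d.strip()]
    if "no-transform" not in (d.lower() for d in directives):
        directives.append("no-transform")
    out["Cache-Control"] = ", ".join(directives)
    return out
```

The second function would be applied to the responses for pages that
contain the HTTPS links, the first one to the requests received over
HTTPS.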

Francois.

Received on Friday, 14 November 2008 17:38:11 UTC