[HTTPS] Thoughts on HTTPS Links rewriting from Eduardo Casais on 2008-11-17 (public-bpwg-ct@w3.org from November 2008)

From: Eduardo Casais <casays@yahoo.com>
Date: Mon, 17 Nov 2008 02:05:58 -0800 (PST)
To: public-bpwg-ct@w3.org
Message-ID: <950260.1179.qm@web45003.mail.sp1.yahoo.com>
HTTPS rewriting has perhaps been the most criticized feature of the CTG, and the
latest information collected by François makes it clear that we are probably
trying the impossible -- simultaneously breaking something and making it work.

From my perspective, we are facing the following issues:


a)	Defining transformations

HTTPS URI rewriting is a specific transformation (of the request and of the
response bodies). It has repeatedly been said that the goal of the CTG is
to specify signalling amongst entities in the HTTP flow, not to define how and
when they should do which transformations -- but this is precisely what is
being attempted with HTTPS URI rewriting (do not do HTTPS rewriting when..., 
rewritten links must have the https scheme..., clients must be informed of
invalid certificates..., etc.). 

As an example, the proposal to handle the use case I raised, "Proxies MUST 
NOT rewrite HTTPS links when the hostname targeted by the link does not match 
the one of the resource being retrieved", is a thread that, once unwound, leads 
to other difficulties: what if the domains match, because the site has an agent 
switcher (returning both desktop-optimized and mobile-optimized content)?
What if there is no domain name, but an IP address -- one that may resolve to
several domain names? And we are not even sure whether HTTPS URIs should be 
rewritten to other HTTPS URIs or not...
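
To make the difficulty concrete, here is a minimal sketch in Python of the
quoted hostname-matching rule. The function names and example URIs are
hypothetical and do not come from the CTG; refusing IP literals outright is an
assumption of this sketch, not a proposal.

```python
import ipaddress
from urllib.parse import urlsplit


def is_ip_literal(host):
    """Return True if host is an IPv4 or IPv6 address rather than a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def rule_allows_rewrite(resource_uri, link_uri):
    """Hypothetical check of the quoted rule: rewrite only on a host match."""
    link = urlsplit(link_uri)
    if link.scheme.lower() != "https":
        return True  # the quoted rule only constrains HTTPS links
    resource_host = urlsplit(resource_uri).hostname
    link_host = link.hostname
    if resource_host is None or link_host is None:
        return False
    # An IP address may map to several domain names, so a textual match
    # proves little (assumption of this sketch: never rewrite in that case).
    if is_ip_literal(resource_host) or is_ip_literal(link_host):
        return False
    return resource_host == link_host
```

Even when such a check returns True, it cannot tell whether the matching host
runs an agent switcher, which is exactly the kind of case the rule leaves open.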

It is likely that we have not even figured out all cases where HTTPS URI
rewriting leads to unsuspected results. Since the CTG goal is not to 
define how to perform transformations correctly, no time should be wasted 
on patching up HTTPS URI rewriting. We have refrained from doing so for 
conversions between character encodings, for instance; the document ought to
be consistent throughout.


b)	Other security issues

The discussion has focussed on HTTPS URI rewriting as a security issue -- which
was positive, since it highlighted the consequences of such transformations
and the necessity to provide robust signalling mechanisms to clients and 
servers.

There are, however, other forms of rewriting and transformation that may entail
security consequences, but they have neither been considered in detail, nor
have guidelines been defined to make them "work":
1. rewriting of hidden variables (often used to keep state and session data);
2. manipulation of cookies (with man-in-the-middle security implications);
3. filtering of scripts (which are often used to carry out validations).

Dealing exclusively with HTTPS URI rewriting gives the impression the CTG is 
actually endeavouring to figure out a way to implement a proprietary feature 
of some transcoders "right" -- to the detriment of the generality required for
such a document.


c)	Signalling vs. transformations

From that perspective, the CTG should enforce the following signalling 
requirements:
1. Clients have a choice to access the original, unadulterated service over
the original, unadulterated communication link, or the transformed one over
a potentially transformed link.
2. Clients are informed of the potential general usability and security 
consequences of using transformed services.
3. Servers have a way to identify unambiguously whether a transcoder is 
present, potentially altering the service characteristics (requests,
responses, communication link).
4. Servers have a way to signal that their service must not be altered.
5. Transcoders have a basic mechanism to shut down the activity of other
transcoders upstream (requests) or downstream (responses).

The CTG is almost there regarding these requirements; it should make them 
robust. HTTPS URI rewriting is inherently brittle, and so far nobody sees
a way to make it robust enough.
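
As an illustration of requirements 3 and 4, the following Python sketch
inspects HTTP header fields. `Cache-Control: no-transform` and `Via` are
standard HTTP/1.1 header fields; applying them to these requirements in this
way is an assumption of the sketch, and the function names are hypothetical.

```python
def header_values(headers, name):
    """Collect comma-separated values of a header, case-insensitively.

    Naive split: commas inside quoted strings or comments are not handled.
    """
    values = []
    for key, value in headers.items():
        if key.lower() == name.lower():
            values.extend(v.strip() for v in value.split(",") if v.strip())
    return values


def forbids_transformation(headers):
    """Requirement 4: the message asks not to be altered."""
    directives = header_values(headers, "Cache-Control")
    return any(d.lower() == "no-transform" for d in directives)


def intermediaries(headers):
    """Requirement 3: list the intermediaries announced in Via."""
    return header_values(headers, "Via")
```

A `Via` entry only shows that an intermediary is on the path, not that it
transforms content, so it falls short of the unambiguous identification that
requirement 3 asks for.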


d)	Missing use cases

At the risk of sounding repetitive, let me state once more that the use case
for HTTPS URI rewriting is extremely questionable. It assumes that:

1. Users are ready to forgo end-to-end security for a site where it is expected;
2. they are ready to accept the usability issues that result from transcoding,
probably heightened, because secure sites tend to rely heavily on scripts (e.g.
JavaScript);
3. they are ready to accept the fact that, for all the technical reasons listed
by Thomas Roessler, HTTPS URI rewriting has a high probability of not working
at all;
4. this population of users is statistically significant;
5. there is no mobile-optimized site corresponding to the desktop one this
significant population attempts to access via transcoders.

This stretches credibility. HTTPS URI rewriting looks like an abstract, 
contrived use case, not a realistic one. Operators deploying HTTPS-rewriting 
transcoders are already whitelisting a whole range of secure sites -- further 
evidence that HTTPS rewriting and the ensuing break of end-to-end security are 
actually not acceptable. 


e)	Security considerations

What is the way forward?

The CTG should eliminate the section on HTTPS URI rewriting and refrain from 
trying to give the impression that this transformation somehow works (because
it cannot), even with all the prudent caveats already included.

Instead, there should be an independent chapter "Security considerations", just
like in IETF RFC standards, which
1. Lists transformations with potential security impact (cookies, hidden 
variables, scripts, HTTPS URIs).
2. Mentions the consequences of HTTPS URI rewriting (break of end-to-end
security, various technical issues listed by Th. Roessler).
3. Determines that since the CTG is not a legal document, whatever happens at
the proxy level once end-to-end security is broken is undefined (i.e. whatever
happens to passwords, input, communication, etc., is undefined).
4. Refers to the signalling primitives in the main text to enable users to
decide whether to access a service via rewritten links or not, and for servers
to accept or deny a service over a rewritten link.
5. Resolves that because of 2 and 3, transformations that wilfully break 
security are not considered to be good practices.

Let us not fool ourselves with declarations like "The BPWG does not condone 
link rewriting, but..." Once the BPWG spends time (and time it did spend) 
figuring out when and how to do HTTPS URI rewriting, it is condoning the 
practice. 

This specific transformation causes plenty of security and technical 
problems; intelligent people have tried to make it work acceptably and could 
not. Let us nail down the signalling issues and provide a way for clients and 
servers to make an informed decision, but let us forget about HTTPS URI 
rewriting itself.


E.Casais
Received on Monday, 17 November 2008 10:08:37 UTC