
Re: The ability to automatically upgrade a reference to HTTPS from HTTP

From: Roy T. Fielding <fielding@gbiv.com>
Date: Sat, 23 Aug 2014 09:37:46 -0700
Message-Id: <EDFB398C-82F1-42B4-9CD0-B1E97430B4FF@gbiv.com>
Cc: Tim Berners-Lee <timbl@w3.org>, Public TAG List <www-tag@w3.org>, SW-forum Web <semantic-web@w3.org>
To: "ashok.malhotra@oracle.com" <ashok.malhotra@oracle.com>

There is not, and never has been, any shared authority between a site on port 80 and a site on 443. While they might appear the same on the most popular web properties, they are usually not the same on the long tail of the internet. They usually aren't operated by the same org, are not subject to the same admin procedures, and in many cases aren't even aware of the other's existence except when some fool browser developer assumes the web is nothing more than the sum of walled gardens they inhabit.
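
As a minimal sketch of why the two are distinct, assuming the (scheme, host, port) origin triple of RFC 6454 as the unit of authority: an http: URL on its default port 80 and an https: URL on its default port 443 never compare equal, even when the host name matches. The function names `origin` and `same_origin` are hypothetical, chosen only for this illustration.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed in this thread.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple for an http(s) URL.

    Hypothetical helper: it fills in the default port when the URL
    does not state one explicitly.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a, b):
    """True only when scheme, host, and port all match."""
    return origin(a) == origin(b)


if __name__ == "__main__":
    print(origin("http://example.org/page"))
    print(origin("https://example.org/page"))
    print(same_origin("http://example.org/page", "https://example.org/page"))
```

Nothing in this comparison says who operates either endpoint; it only shows that a matching host name does not by itself make the two the same site.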

I'd really appreciate if we could stop overreacting to what was well known 20 years ago. The public web is public. Telling everyone to use https only works if we have both secure certificates for free and double the available network bandwidth, and even then it won't stop the connection metadata from telling a snoop exactly what the user is viewing. Our assumption has always been that the ability to freely view the world's information with low latency and at nearly no cost was more important than cloaking all users in anonymity.

We should encourage the use of https whenever the payloads require confidentiality, but we should not pretend that makes it anonymous. In almost all cases, https gets combined with authentication and shared state that identifies each individual user. Anonymous browsing is far better supported by proxies (routers) that mask the metadata, but they don't work at all with e2e encryption.

If we want to design a new Web that is secure from oversight, as opposed to the existing one that is deliberately insecure to promote shared resources, then we have to start over at the routing and naming layers. TLS is only sufficient to hide the content of header fields and regular payloads. Telling the application layer to hide all traffic is a waste of time if we don't also perform the kinds of masking used by Tor.

Of course, if we do such a thing, it will be immediately overwhelmed by the tragedy of the commons, because no one likes anonymity more than the scum that send spam, phish, zombie attacks, and other abuse.

I personally think we are better off with both a public and a private Web, a little more education about when TLS ought to be required, and services that can mediate between the two (for authenticated commentary that does not need to reveal identity).

....Roy


> On Aug 22, 2014, at 3:19 PM, ashok malhotra <ashok.malhotra@oracle.com> wrote:
> 
> This is a good proposal!
> If you are serving different resources from http://xyz and https://xyz, dude, that's your problem!
> 
> All the best,
> Ashok
> 
>> On 8/22/2014 1:00 PM, Tim Berners-Lee wrote:
>> There is a massive and reasonable push to get everything from HTTP space into HTTPS.
>> While this is laudable, the effect on the web as a hypertext system could be
>> very severe, in that links into http: space will basically break all over the place.
>> Basically every link in the HTTP web we are used to breaks.
>> 
>> Here is a proposal, that we need this convention:
>> 
>>     If two URIs differ only in the 's' of 'https:', then they may never be used for different things.
>> 
>> That sounds like a double-negative way of putting it, but it avoids saying things we don't want to mean.
>> I don't mean you must always serve up https or always serve up http.
>> Basically we are saying the 's' isn't a part of the identity of the resource, it is just a tip.
>> 
>> So if I have successfully retrieved https:x  (for some value of x) and I have a link to http:x then I can satisfy following the link, by presenting what I got from https:x.
>> I know that whatever I get if I do do the GET on the http:x, it can't be different from what I have.
>> 
>> The opposite however is NOT true, as a page which links to https:x requires the transaction to be made securely.  Even if I have already looked up http:x, I can't assume that I can use it for https:x.  But that is for reasons of security alone -- it would still be against the principle if the server deliberately served something different.
>> 
>> This means that if you have built two completely separate web sites in HTTPS and HTTP space, and you may have used the same path (modulo the 's') for different things, then you are in trouble. But who would do that?   I assume the large search engines know who.
>> 
>> I suppose an exception for human readable pages may be that the http: version has a warning on it that the user should be accessing the https: one.
>> 
>> With linked data pages, where a huge amount of the Linked Open Data cloud is in http: space last time I looked, systems using URIs for identifiers need to be able to canonicalize them so that anything said about http:x applies equally to https:x.
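
The canonicalization described above could be sketched as follows. This is only an illustration under stated assumptions: the function names `canonical_key` and `same_identifier` are hypothetical, and picking http: as the canonical spelling is an arbitrary choice made here, not something the proposal specifies. URIs are treated as equivalent only when they differ in nothing but the 's' of the scheme, so an explicit port or a different path still makes them distinct.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_key(uri):
    """Map an https: URI to its http: spelling for comparison only.

    Assumption: http: is used as the canonical form. Everything
    after the scheme is left untouched.
    """
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    if scheme == "https":
        scheme = "http"
    return urlunsplit(
        (scheme, parts.netloc, parts.path, parts.query, parts.fragment)
    )


def same_identifier(a, b):
    """True when two URIs differ at most in the 's' of 'https:'."""
    return canonical_key(a) == canonical_key(b)


if __name__ == "__main__":
    print(canonical_key("https://example.org/id/1#me"))
    print(same_identifier("http://example.org/a", "https://example.org/a"))
```

A store of statements keyed by `canonical_key` would then return the same entries whether it is queried with the http: or the https: form of an identifier.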
>> 
>> What this means is that a client given an http: URL in a reference is always free to try out the HTTPS, just adding an S, and use the result if the request is successful.
>> Sometimes, if browser security prevents an https-origin web page from loading any http resources, as Firefox proudly does [1], and you are writing a general purpose web app which has to read arbitrary web resources with XHR, then, ironically, you have to serve it over HTTP!     In the meantime, many client libraries will, I assume, need to just try HTTPS, as that is all they are allowed.
>> 
>> Or do we have to only build serious internet applications as browser extensions or native apps?
>> 
>> For this and many related reasons, we need to first get a very high level principle that if a client switches from http to https of its own accord, then it can't be given misleading data as a result.
>> 
>> I suspect this has been discussed in many fora -- apologies if the issue is already noted and resolved, and do point to where it has.
>> 
>> TimBL
>> 
>> [1] https://blog.mozilla.org/tanvi/2013/04/10/mixed-content-blocking-enabled-in-firefox-23/
>> 
>> In order for this switch to be made, transitions
> 
> 
Received on Saturday, 23 August 2014 16:38:13 UTC
