fyi: Security on the Web - John Kemp

Security on the Web - John Kemp
http://www.w3.org/2001/tag/2011/02/security-web.html

--
[plain text facsimile]


Security on the Web
John Kemp
4th February 2011


Abstract

This document summarizes the history and current state of some of the security
features of the Web, and the effects that some of these features have on
overall Web security.


Introduction

The Web (initially physically implemented as an HTTP server, a Web browser and
the HTTP protocol) was born with a particular set of properties in mind.
Initially, security was not one of these properties. As the Web grew, and its
components were extended to support new models and ideas, it became necessary
to consider the security properties of the Web. However, the massive installed
base of Web agent software made it difficult to deploy or enforce security
properties reliably and broadly, so incremental "patches" were applied. This
led to an arms race between those making the patches and those creating the
attacks (particularly in the area of the "same-origin policy").


History

HTTP - the "stateless" protocol

Requirements of the Web (largely from [1])

     Remote access (from one machine or network to another)
     De-centralization (no central control of the overall system)
     Layered protocols (IP, DNS, HTTP et al)

HTTP 0.9 ([2])

     GET method only
     HTML the only response format
     No headers (meaning no fine-grained control by client of server, or
     vice-versa)
     No authentication mechanism defined
     No confidentiality mechanism defined
     Simple entity model (client requests a document of a server, server
     delivers requested document)

HTTP 1.0 ([3])

     Added POST and PUT (among others)
     Added security considerations
     HTTP BASIC authentication defined, with extensibility for other client
     authentication mechanisms
     Added HTTP headers which provided both clients and servers with more
     fine-grained control of the protocol
     Acknowledgement ("Safe methods" section) that software agents may
     "represent" a "user" on the Web
     Redirection is described (including the option for a client to redirect
     without user intervention)
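
The difference between the two protocol versions listed above can be sketched
in a few lines. This is an illustration only, not taken from the original
document: the function names, the path and the credentials are hypothetical.
It shows an HTTP 0.9 request (no version, no headers) next to an HTTP 1.0
request that carries a BASIC Authorization header, and makes visible that
BASIC credentials are only base64-encoded, not encrypted.

```python
import base64


def simple_request(path: str) -> bytes:
    """HTTP 0.9 style request: method and path only, no version, no headers."""
    return f"GET {path}\r\n".encode("ascii")


def full_request(path: str, user: str, password: str) -> bytes:
    """HTTP 1.0 style request with a BASIC Authorization header.

    The credentials are base64-encoded "user:password"; anyone who can read
    the request can decode them, so BASIC offers no confidentiality.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.0",
        f"Authorization: Basic {token}",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


if __name__ == "__main__":
    print(simple_request("/index.html"))
    print(full_request("/private/index.html", "Aladdin", "open sesame"))
```

Decoding the header value with base64.b64decode recovers the user name and
password directly, which is why BASIC is normally paired with a confidential
transport.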


Origin cookies are first baked

Netscape Navigator first implemented support for cookies in its version 2.0 
browser, dating from 1996. Cookies offered a mechanism to allow a server to 
store per-client state, and have the client supply a (server-assigned) pointer 
to its state, automatically (via the client implementation) when sending any 
request to the cookie-specified domain and URL path. Many sites used this 
facility to identify a user session with the site, and then stored 
per-user/session data (such as a shopping cart) related to the cookie identifier.
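
The pattern described above, a server-assigned pointer held by the client and
the actual state held by the server, can be sketched as follows. This is a
minimal illustration under assumptions: the cookie name "session", the path
"/shop" and the in-memory SESSIONS table are hypothetical and do not come from
the original document.

```python
import secrets
from http.cookies import SimpleCookie

# Server-side state, keyed by the identifier the client sends back.
SESSIONS = {}


def start_session() -> str:
    """Create per-client state and return a Set-Cookie header value."""
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {"cart": []}
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["path"] = "/shop"  # client returns it only under this path
    return cookie["session"].OutputString()


def load_session(cookie_header: str):
    """Look up the state that an incoming Cookie header points to, if any."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session")
    if morsel is None:
        return None
    return SESSIONS.get(morsel.value)


if __name__ == "__main__":
    set_cookie = start_session()
    print("Set-Cookie:", set_cookie)
    # The client echoes back only the name=value pair on later requests.
    print(load_session(set_cookie.split(";")[0]))
```

The client never sees the shopping cart itself, only the identifier; whoever
presents that identifier is treated as the owner of the session.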

Cookies became successful because they were more reliable session indicators
than competing mechanisms (such as putting session state in the URI or body of
an HTTP request, which requires, for example, that users don't accidentally
drop the session part of the URI).

In order to ensure that a cookie was sent only to the originating domain, the
browser needed to be able to determine the domain associated with a document,
and thus the "origin" was born: scheme, host and port together define a unique
origin. The same-origin policy states that a document from one unique origin
may only load resources from the origin from which the document was loaded. In
particular, this applies to XMLHttpRequest calls made from within a document.
Images, CSS and dynamically-loaded scripts are not subject to the same-origin
policy.
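
The origin comparison described above can be sketched as a tuple comparison.
This is an illustrative approximation rather than any browser's actual
algorithm: the helper names and the DEFAULT_PORTS table are assumptions made
for the example, and the URLs are placeholders.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Assumed default ports, used when a URL does not name a port explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Tuple[str, str, Optional[int]]:
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a: str, url_b: str) -> bool:
    """Two URLs share an origin only if scheme, host and port all match."""
    return origin(url_a) == origin(url_b)


if __name__ == "__main__":
    page = "http://example.com/app/index.html"
    for target in (
        "http://example.com:80/api/data",  # same origin (explicit default port)
        "https://example.com/api/data",    # different scheme
        "http://www.example.com/api/data", # different host
        "http://example.com:8080/api",     # different port
    ):
        print(target, same_origin(page, target))
```

Note that the path plays no part in the comparison: two documents on the same
scheme, host and port share an origin regardless of where they live on the
server.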


AJAX: The rise of the cross-domain request [5]

In 1999, Microsoft released the second version of its MSXML library, which
provided access, via ActiveX, to an API for making a direct HTTP request to a
web server. Other browsers followed: Mozilla released a JavaScript
XMLHttpRequest (XHR) API in its Gecko engine, and other browser vendors later
released similar APIs.

The "XHR" API was later standardized by the W3C [6], and JavaScript-based
frameworks have abstracted the API and popularized its usage by making it even
easier to make HTTP requests from within Web documents.

Various mechanisms, including JSONP [7], are used to circumvent the same-origin
policy enforced on XHR and to allow cross-domain calls from Web applications.
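
As a rough sketch of how JSONP works around the policy: the page defines a
callback function and adds a script element whose source points at another
origin; the server wraps its JSON data in a call to that function, and because
script loads are not subject to the same-origin policy, the response executes
in the page and hands over the data. The server side of that exchange might
look like the following. The function name jsonp_body, the callback name
"handleData" and the validation rule are assumptions for illustration, not
part of the original document.

```python
import json
import re

# Accept only plain identifier-like callback names, so the caller-supplied
# name cannot smuggle arbitrary script into the response.
_CALLBACK_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def jsonp_body(callback: str, data) -> str:
    """Wrap JSON data in a call to the callback named by the requesting page."""
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError("unsafe callback name")
    return f"{callback}({json.dumps(data)});"


if __name__ == "__main__":
    # A page might request .../data?callback=handleData via a script element.
    print(jsonp_body("handleData", {"user": "alice"}))
```

The requesting page has to trust the responding server completely, since
whatever comes back is run as script with the page's own privileges.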


The rise of JavaScript and Web applications

Web browsers have contained a JavaScript execution environment since Netscape
Navigator 2.0. Initially, JavaScript was denigrated by "serious" programmers,
but thanks to the popularity of AJAX, it has become one of the most popular
software development languages in use today.

Exposing APIs from a Web browser, via JavaScript and the DOM, allows a
developer to manipulate client resources in a much more fundamental way than
simply rendering and viewing a document.

There are several security vulnerabilities related to the use of JavaScript in
browser environments (see [9] for examples).


Security-related issues of Web architecture

     Documents (representations of Web resources) are often formed of content 
acquired from more than one "security domain" (an environment defined by a 
single set of security policies). Interactions between these pieces of content 
must be mediated in a "sandbox" environment on the client to prevent the 
possibility of content from one security domain causing problems with content 
from another security domain.

     Web browser redirects often take place without user input (see 'cookies'
below), with consequences the user did not intend.

     Web browser state management has been based on cookies, which are a shared
client (browser) resource. One site may cause another site's cookie to be sent
in a request to the site which "owns" the cookie, causing that site to believe
that the user is making an intentional and authenticated request when in fact
this may not be true (as in a cross-site request forgery or clickjacking
attack).

     Spoofing the identity of a Web site is relatively easy (Referer header
spoofing, DNS rebinding and cache poisoning, confusing the user with content
which looks authentic but is controlled and presented by an attacker).

     Servers often depend on a client to "do the right thing" in providing
security for the server (such as correctly processing Web 'origin' and
'referer' information in order to allow the server to authenticate a request),
but clients are open to manipulation by servers and to software defects. Not
all clients will "do the right thing" -- by design.

     Authenticated protocols are based on un-authenticated protocols (for
example, there is no true link between SSL certificate validation and the IP
address that DNS returns for the common name in the certificate).

     There are no separate "download", "install" and "execute" steps for a
user. Web content is often immediately executed by the client, without giving
the user a chance to approve access to sensitive or limited client resources
(such as CPU and local storage).

     Documents, or excerpts thereof, are usually not tied to their publisher in
any way that can be verified across the Web (such as by an interoperable
cryptographic signature); see for example "content-centric networking" [8].
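
The shared-cookie issue in the list above can be made concrete with a toy
model of a browser cookie store. This is a deliberately simplified sketch:
the class ToyCookieJar, its methods and the host names are hypothetical, and
real browsers apply many more rules. The point it illustrates is that cookies
are selected by the destination of a request, not by the site that caused the
request to be made.

```python
from urllib.parse import urlsplit


class ToyCookieJar:
    """A simplified browser cookie store shared by every page in the browser."""

    def __init__(self):
        self._cookies = {}  # host -> {name: value}

    def store(self, host: str, name: str, value: str) -> None:
        self._cookies.setdefault(host, {})[name] = value

    def cookie_header(self, url: str, initiator: str) -> str:
        """Return the Cookie header sent with a request to `url`.

        `initiator` is the page that triggered the request. It is accepted
        only to show that it plays no part in the decision.
        """
        host = urlsplit(url).hostname or ""
        cookies = self._cookies.get(host, {})
        return "; ".join(f"{name}={value}" for name, value in cookies.items())


if __name__ == "__main__":
    jar = ToyCookieJar()
    jar.store("bank.example", "session", "s3cret")
    # A request caused by the bank's own page and one caused by an attacker's
    # page carry the same cookie, so the server cannot tell them apart by it.
    print(jar.cookie_header("https://bank.example/transfer", "https://bank.example/"))
    print(jar.cookie_header("https://bank.example/transfer", "https://evil.example/"))
```

From the server's point of view both requests arrive with a valid session
cookie, which is why the cookie alone says nothing about the user's intent.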


Desirable security properties of the Web?

     That one Web agent doesn't have to inordinately trust the correct 
behaviour of a whole class of Web agents when exposing a resource to the Web

     That it is possible to "tie" one layer of Web protocol to other layers
(DNS IP address should be tied to IP address of SSL cert, SSL cert key used to
sign token at app layer protocol, etc.) so that, when necessary, they cannot be
separated.

     That it is possible to load or embed all Web resources from multiple
security domains in a consistent manner (unlike the current situation, where
images and CSS are not subject to the same-origin policy, and where scripts may
be dynamically added to a page (via the <script src=.../> tag) without being
subject to the same-origin policy).


Some current Web security-related standards work

     IETF WebSec
     CORS
     UMP
     XHR2
     HTML5 Sandboxed iFrame
     DNSSEC
     ECMAScript Strict Mode

References

     1
     2
     3
     4 (CORS, UMP and XHR2 summary)
     5
     6
     7
     8
     9
     10 (Same-origin policy weaknesses)

---
end

Received on Monday, 31 October 2011 16:50:19 UTC