Embedding Safety Directives for Content Security Policy

This document defines a directive for the Content Security Policy 1.0 mechanism to allow Web application developers to declare a set of protections for a web resource to help prevent malicious obscuring or re-contextualizing of the resource's user interface when it is displayed in an embedded context. The document also defines a set of heuristics for Web user agents to implement these protections and a reporting mechanism for when they are triggered.

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("MUST", "SHOULD", "MAY", etc) used in introducing the algorithm.

A conformant user agent is one that implements all the requirements listed in this specification that are applicable to user agents.

A conformant server is one that implements all the requirements listed in this specification that are applicable to servers.

Terminology

This section defines several terms used throughout the document.

The term security policy, or simply policy, for the purposes of this specification refers to either:

a set of security preferences for restricting the behavior of content within a given resource, or
a fragment of text that codifies these preferences.

The security policies defined by this document are applied by a user agent on a per-resource representation basis. Specifically, when a user agent receives a policy along with the representation of a given resource, that policy applies to that resource representation only. This document often refers to that resource representation as the protected resource.

A server transmits its security policy for a particular resource as a collection of directives, such as default-src 'self', each of which controls a specific set of privileges for a document rendered by the user agent. More details are provided in the directives section.

A directive consists of a directive name, which indicates the privileges controlled by the directive, and a directive value, which specifies the restrictions the policy imposes on those privileges.
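The name/value structure described above can be illustrated with a minimal parsing sketch. It assumes, as in Content Security Policy 1.0, that directives are separated by semicolons and that a repeated directive name is ignored after its first occurrence; the function name parse_policy is hypothetical and not part of this specification.

```python
def parse_policy(policy_text):
    """Split a policy string into a {directive-name: [value tokens]} mapping.

    Assumption: directives are ';'-separated and only the first occurrence
    of a directive name is kept.
    """
    directives = {}
    for chunk in policy_text.split(";"):
        tokens = chunk.split()
        if not tokens:
            continue  # skip empty segments, e.g. after a trailing ';'
        name = tokens[0].lower()
        if name not in directives:
            directives[name] = tokens[1:]
    return directives


policy = parse_policy("default-src 'self'; img-src *")
print(policy["default-src"])
```

The directive value is kept as a list of whitespace-separated tokens so that later checks can test membership of individual sources.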

The term origin is defined in the Origin specification. [ORIGIN]

The term URI is defined in the URI specification. [[!URI]]

The <script>, <object>, <embed>, <img>, <video>, <audio>, <link>, <frame> and <iframe> elements are defined in the HTML5 standard. [[!HTML5]]

The <applet> element is defined in the HTML 4.01 standard. [[!HTML401]]

The @font-face CSS rule is defined in the CSS Fonts Module Level 3 standard. [[!CSS3FONT]]

The XMLHttpRequest object is defined in the XMLHttpRequest standard. [[!XMLHTTPREQUEST]]

The WebSocket object is defined in the WebSocket standard. [WEBSOCKET]

The EventSource object is defined in the EventSource standard. [EVENTSOURCE]

The Augmented Backus-Naur Form (ABNF) notation used in this document is specified in RFC 5234. [[!ABNF]]

The following core rules are included by reference, as defined in [ABNF Appendix B.1]: ALPHA (letters), DIGIT (decimal 0-9), WSP (white space) and VCHAR (printing characters).

The OWS rule is used where zero or more linear whitespace octets might appear. OWS SHOULD either not be produced or be produced as a single SP. Multiple OWS octets that occur within field-content SHOULD either be replaced with a single SP or transformed to all SP octets (each octet other than SP replaced with SP) before interpreting the field value or forwarding the message downstream.

OWS            = *( SP / HTAB / obs-fold )
               ; "optional" whitespace
obs-fold       = CRLF ( SP / HTAB )
               ; obsolete line folding

Directives

This section describes the content security policy directives introduced in this specification.

`embed-ancestors`

TODO: Coordinate with the IETF websec Working Group.

The embed-ancestors directive indicates whether the user agent should embed the resource using a frame, iframe, object or embed tag, or equivalent functionality in non-HTML resources. Resources can use this to avoid many UI Redressing attacks by ensuring they are not embedded into other sites. This directive replicates some of the functionality of the X-Frame-Options header. The syntax for the name and value of the directive is described by the following ABNF grammar:

directive-name    = "embed-ancestors"
directive-value   = source-list

Unlike policies defined in Content Security Policy 1.0, the embed-ancestors directive is not subject to the default-src directive. If this directive is not explicitly stated in the policy, its value is assumed to be "*".

If 'deny' is present in the source-list, the resource cannot be displayed in an embedded context, regardless of the origin attempting to do so, and all other members of the source-list are ignored. This provides functionality equivalent to the DENY value of the X-Frame-Options header.

If 'deny' is not present, the source-list indicates which origins are valid ancestors for the resource. An ancestor is any resource between the protected resource and the top of the window frame tree; for example, if A embeds B which embeds C, both A and B are ancestors of C. If A embeds both B and C, B is not an ancestor of C, but A still is.

The 'self' source indicates that content from the same origin as the protected resource may embed it. This provides functionality equivalent to the SAMEORIGIN value of the X-Frame-Options header.
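The rules above for 'deny', 'self' and the default "*" can be sketched as follows. This is an illustration under stated assumptions, not a normative algorithm: it assumes every ancestor must match at least one source, it supports only "*", 'self' and exact origin URLs (no host wildcards or scheme-less sources), and the helper names origin_of, source_matches and may_embed are hypothetical.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url):
    """Return the (scheme, host, port) triple for an absolute URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def source_matches(source, ancestor_url, protected_url):
    """Simplified matching: '*', 'self', or an exact origin URL."""
    if source == "*":
        return True
    if source == "'self'":
        return origin_of(ancestor_url) == origin_of(protected_url)
    return origin_of(source) == origin_of(ancestor_url)


def may_embed(source_list, protected_url, ancestor_urls):
    """Decide whether the protected resource may render below its ancestors.

    source_list is None when the policy has no embed-ancestors directive.
    """
    if source_list is None:
        source_list = ["*"]  # directive absent: value is assumed to be "*"
    if not ancestor_urls:
        return True  # not embedded, so there is nothing to check
    if "'deny'" in source_list:
        return False  # 'deny' overrides every other member of the list
    # Assumption: each ancestor must be matched by at least one source.
    return all(
        any(source_matches(s, a, protected_url) for s in source_list)
        for a in ancestor_urls
    )
```

For the A embeds B embeds C example, ancestor_urls for C would contain the URLs of both A and B, so a single unlisted ancestor anywhere in the chain is enough to refuse embedding under this sketch.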

`click-protection`

The click-protection directive, if present, instructs the user agent to apply the heuristic clickjacking protections described in Section XXX to click and drag-and-drop events before they are delivered to the resource in an embedded context.

ISSUE: Need some optimization language here. e.g. If the resource is not embedded (it is topmost in the user agent rendering context) or if all ancestors are same origin this token may be ignored. TODO: this still allows attacks with multiple windows in environments where that is possible (traditional desktop OS) but to defend against this the user-visible screenshot would have to be defined in terms of the OS, not the user agent.

directive-name    = "click-protection"
directive-value   = "block" / "deliver"

A value of block indicates the user agent should not deliver the event to the resource if the click protection heuristic is triggered.

A value of deliver indicates the user agent should deliver the event to the resource, even if the click protection heuristic is triggered.

If a report-uri is present, triggering of the click protection heuristic MUST always generate a report, whether the event is delivered or not.

User agents SHOULD NOT prompt the user when the click protection heuristic is triggered.
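The interaction between the block and deliver values and report-uri can be summarized in a small decision sketch. The function name on_heuristic_triggered and its return shape are hypothetical; the sketch only covers the case where the click protection heuristic has already been triggered.

```python
def on_heuristic_triggered(directive_value, report_uri=None):
    """Return (deliver_event, send_report) for a triggered heuristic.

    directive_value is the click-protection value, "block" or "deliver".
    report_uri is None when the policy carries no report-uri.
    """
    if directive_value not in ("block", "deliver"):
        raise ValueError("click-protection value must be 'block' or 'deliver'")
    deliver_event = directive_value == "deliver"
    # A report is generated whenever a report-uri is present,
    # whether or not the event is delivered.
    send_report = report_uri is not None
    return deliver_event, send_report
```

Note that the two outputs are independent: the directive value alone decides delivery, and the presence of report-uri alone decides reporting.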

ISSUE: is it worth having a "deliver" option, or should this just always be part of a report-only policy?

`click-protection-hints`

The click-protection-hints directive allows a resource to provide hints to the click-protection heuristics for greater accuracy.

directive-name    = "click-protection-hints"
directive-value   = ["tolerance=" num-val] ["viewport-height=" num-val] ["viewport-width=" num-val] ["ui-delay=" num-val]

If the policy does not contain explicit click-protection-hints or any of the optional values are absent, the user agent should apply default values as described in Section XXX. A user agent MAY ignore any or all values in click-protection-hints.

tolerance is a numeric value from 0-99 that defines the threshold at which the screenshot comparison procedure triggers the click protection heuristic. A value of 0 indicates that no difference between the two images is permitted. A value of 99 provides little to no practical protection.

viewport-height is a numeric value that defines the height of the viewport to be used for performing the screenshot comparison.

viewport-width is a numeric value that defines the width of the viewport to be used for performing the screenshot comparison.

ui-delay is a numeric value that specifies the delay time, in milliseconds, used in the click protection heuristic.
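A minimal sketch of reading a click-protection-hints value follows. The grammar above does not define num-val or the separator between hints, so this sketch assumes whitespace-separated tokens and non-negative decimal integers; the function name parse_click_protection_hints is hypothetical. Unknown or malformed tokens are skipped, in line with the allowance that a user agent MAY ignore any or all values.

```python
import re

HINT_NAMES = ("tolerance", "viewport-height", "viewport-width", "ui-delay")
HINT_RE = re.compile(r"^([a-z-]+)=([0-9]+)$")


def parse_click_protection_hints(value):
    """Return a dict holding only the hints that are present and well formed.

    Absent hints are left out so the caller can apply its own defaults.
    """
    hints = {}
    for token in value.split():
        match = HINT_RE.match(token)
        if not match or match.group(1) not in HINT_NAMES:
            continue  # unknown or malformed hint: ignore it
        name, number = match.group(1), int(match.group(2))
        if name == "tolerance" and not 0 <= number <= 99:
            continue  # tolerance is only defined for 0-99
        hints[name] = number
    return hints


print(parse_click_protection_hints("tolerance=10 viewport-width=400 ui-delay=500"))
```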

protected-id ??? does it make sense to allow protections to apply to a named element in the DOM, instead of a viewport window?

Click Protection Heuristic

This section is non-normative. The algorithm described here can be implemented mostly in terms of HTML5 constructs, but requires the ability to monitor and intercept actions in the rendering of a resource and delivery of events to that resource. User agents may apply equivalent protections using means more optimized for their implementation details, may ignore recommendations where the browsing environment eliminates certain classes of attack (e.g. cursor sanity check in a touch-only environment), or may implement some features in terms of the underlying operating system or platform rather than directly in the user agent.

Algorithm Description

Listener registration - Register a "global" capturing event listener for mouse button, tapping, keyboard, drag & drop and focus events, which must be guaranteed to run before any other event handler of the same kind and therefore be able to prevent any event from being handled by the content, if needed. CBC: in order to guarantee the "first to process" event listener requirement and reduce registration overhead, ClearClick adds its listener to the Mozilla-specific DocShell object which is the immediate container of the topmost DOM window per any given tab. A cross-browser approach likely to work is registering the listener on the topmost DOM window itself before any script has a chance to run.
Fast-track bypass - Whenever the listener is called, check whether the event target or its owner document are flagged as "unlocked". If either is, return early. CBC: ClearClick uses an expando property to flag DOM nodes and windows, relying on a feature of Mozilla's chrome-exposed DOM wrappers which prevents content from seeing or tampering with expando properties set by privileged code. Other browsers may require different procedures to safely annotate documents and other DOM nodes. Furthermore, this and most of the remaining steps assume our listener can examine and manipulate any DOM node or window independently from its origin, bypassing SOP. This privilege should be granted by the listener having been registered by privileged (browser extension) code.
Parent chain check - Check whether the event target is either a child of a nested document or a plugin content element (EMBED, APPLET or OBJECT). If it is not, or it is an embedded document belonging to a same-site parent chain (i.e. it and all its parents are from the same origin), flag the document as "unlocked" and return. Notice that the original Clickjacking demo by Hansen & Grossman worked despite the Flash content being served same-site: since plugins may follow type-specific origin policies, we never return early at this stage when interacting with plugin content, even if embedded same-site.
Rapid fire check - Check whether the previous event we observed was of the same type on a document from a different origin and happened within the past 800ms (quarantine time). If it was, we assume a "Rapid fire" attack (e.g. the user has been tricked into repeatedly clicking on the same or a predictable location in fast succession while the document gets changed under the mouse pointer): halve the quarantine time and go to step 8. If the next interaction happens with a different document, the quarantine time will be reset.
Cursor sanity check - By querying computed-style with the ":hover" pseudo-class on the element (if the target is plugin content) or on the host frame element and its ancestors (if the target is a nested document), check whether the cursor has been hidden or changed to a possibly attacker-provided bitmap: if it has, go to step 7. This provides protection against "Phantom cursor" attacks, also known as "Cursorjacking".
Obstruction check - By using an offscreen HTML 5 canvas element, we take two reasonably sized (300x200 on average, but growing or shrinking depending on document's inherent size and viewport constraints and hints provided by the viewport-height and viewport-width properties of click-protection-hints) screenshots of the region centered around the DOM element which is about to receive the event: one from its owner document's "point of view" (unobstructed by definition), the other from the topmost window's. In the plugin content case, we ensure the former "screenshot" contains the element itself only. If the number of pixels that differ between the screenshots does not exceed a certain configurable tolerance rate (default 18%, or as set by the tolerance property of click-protection-hints), return. Otherwise we tentatively assume the DOM element our user is interacting with has been obstructed or obscured by a UI Redressing attempt. CBC: the screenshots are taken by using the CanvasRenderingContext2D.drawWindow() method, which is a Mozilla-proprietary extension of the HTML 5 Canvas API available to privileged code only, allowing the content of DOM windows to be drawn on a canvas surface exactly as rendered on the screen. The rest of this phase relies on cross-browser canvas features, instead, such as pixel grabbing and data URL serialization.
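The comparison at the heart of the obstruction check can be sketched as follows. This is a simplified illustration, not the ClearClick implementation: it assumes both screenshots have already been captured as equal-length flat lists of (r, g, b, a) tuples, that a pixel counts as different when any channel differs, and that the names count_differing_pixels and is_obstructed are hypothetical. Capturing the screenshots themselves needs privileged rendering access and is out of scope here.

```python
def count_differing_pixels(document_view, top_window_view):
    """Count positions where the two screenshots hold different pixels."""
    if len(document_view) != len(top_window_view):
        raise ValueError("screenshots must cover the same region")
    return sum(1 for a, b in zip(document_view, top_window_view) if a != b)


def is_obstructed(document_view, top_window_view, tolerance_percent=18):
    """Return True when the differing pixels exceed the tolerance rate.

    tolerance_percent defaults to the 18% mentioned in the description and
    would be replaced by the tolerance hint when one is supplied.
    """
    total = len(document_view)
    if total == 0:
        return False  # nothing to compare
    differing = count_differing_pixels(document_view, top_window_view)
    # Integer comparison avoids floating point rounding at the threshold.
    return differing * 100 > tolerance_percent * total


white, red = (255, 255, 255, 255), (255, 0, 0, 255)
unobstructed = [white] * 10
overlaid = [red] * 2 + [white] * 8  # 2 of 10 pixels differ (20%)
print(is_obstructed(unobstructed, overlaid))
```

With a tolerance of 0 any single differing pixel triggers the heuristic, which matches the description of that value in the click-protection-hints section.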

Examples

Sample Policy Definitions

This section provides some sample use cases and accompanying security policies.

Example 1: A server wishes to load resources only from its own origin:

Content-Security-Policy: default-src 'self'

Example 2: An auction site wishes to load images from any URI, plugin content from a list of trusted media providers (including a content distribution network), and scripts only from a server under its control hosting sanitized ECMAScript:

Content-Security-Policy: default-src 'self'; img-src *;
                         object-src media1.example.com media2.example.com *.cdn.example.com;
                         script-src trustedscripts.example.com

Example 3: An online banking site wishes to ensure that all of the content in its pages is loaded over TLS to prevent attackers from eavesdropping on insecure content requests:

Content-Security-Policy: default-src https: 'unsafe-inline' 'unsafe-eval'

Sample Violation Report

This section contains an example violation report the user agent might send to a server when the protected resource violates a sample policy.

In the following example, a document from http://example.org/page.html was rendered with the following CSP policy:

default-src 'self'; report-uri http://example.org/csp-report.cgi

The document loaded an image from http://evil.example.com/image.png, violating the policy.

{
  "csp-report": {
    "document-uri": "http://example.org/page.html",
    "referrer": "http://evil.example.com/haxor.html",
    "blocked-uri": "http://evil.example.com/image.png",
    "violated-directive": "default-src 'self'",
    "original-policy": "default-src 'self'; report-uri http://example.org/csp-report.cgi"
  }
}
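As an illustration of how such a report body could be assembled, the sketch below builds the same JSON structure from its five fields. The function name build_violation_report is hypothetical; only the field names are taken from the sample report above.

```python
import json


def build_violation_report(document_uri, referrer, blocked_uri,
                           violated_directive, original_policy):
    """Serialize a violation report using the field names of the sample."""
    report = {
        "csp-report": {
            "document-uri": document_uri,
            "referrer": referrer,
            "blocked-uri": blocked_uri,
            "violated-directive": violated_directive,
            "original-policy": original_policy,
        }
    }
    return json.dumps(report, indent=2)


print(build_violation_report(
    "http://example.org/page.html",
    "http://evil.example.com/haxor.html",
    "http://evil.example.com/image.png",
    "default-src 'self'",
    "default-src 'self'; report-uri http://example.org/csp-report.cgi",
))
```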

Security Considerations

Implementation Considerations

XXX TODO

Implementation Considerations for Resource Authors

XXX TODO suggestions on back-end anti-fraud here and use of unique, per-transaction report-uris with information about targets encoded, and use of report-only to collect possible fraud data without blocking transactions.

IANA Considerations

The permanent message header field registry (see [RFC3864]) should be updated with the following registrations:

Content-Security-Policy

Header field name: Content-Security-Policy

Applicable protocol: http

Status: standard

Author/Change controller: W3C

Specification document: this specification (See Content-Security-Policy Header Field)

Content-Security-Policy-Report-Only

Header field name: Content-Security-Policy-Report-Only