Re: Comments on Content Security Policy 1.0 Editor's Draft from Adam Barth on 2012-06-27 (public-webappsec@w3.org from June 2012)

From: Adam Barth <w3c@adambarth.com>
Date: Tue, 26 Jun 2012 17:42:33 -0700
To: "=JeffH" <Jeff.Hodges@kingsmountain.com>
Cc: W3C Web App Security WG <public-webappsec@w3.org>
Message-ID: <CAJE5ia8adtzcQxwGX0Vq-iEVTg4yBFd4JsdvWUsOjNE=f7DZ5g@mail.gmail.com>
On Thu, Jun 21, 2012 at 3:09 PM, =JeffH <Jeff.Hodges@kingsmountain.com> wrote:
> Comments on Content Security Policy 1.0 Editor's Draft
> http://dvcs.w3.org/hg/content-security-policy/raw-file/tip/csp-1.0-specification.html
>
> The CSPv1.0 spec has markedly improved over the last year+ set of revisions.
> Hat tip to AdamB and the working group on that.

Thanks.

> Although the list of comments below appears somewhat long, they largely
> concern fine-grained aspects of the spec. Overall, I found the spec to be in
> good shape -- it is well organized, readable, essentially complete, etc.
>
> I suggest the below comments be appropriately incorporated/resolved prior to
> going formally into Last Call.

Ok.  I'll edit them into the document presently.

> I hope this helps,
>
> =JeffH
>
>
> Substantive comments:
> ---------------------
>
>
> Overall..
>
> Are there any salient differences between HTML5 and HTML4 resources that
> browser implementers will have to code for in terms of implementing CSP
> policy enforcement?  Other than <applet> ?

Not that I know of.  HTML5 defines most of these things as legacy
features.  We've used HTML5 as the "base" version of HTML in CSP, but
I don't think causes any problems.  Did you have anything specific in
mind?

> Also, the <source> and <track> html5 elements have "src=" attrs, but aren't
> mentioned in the CSP spec.  should they be mentioned along with <video> and
> <audio> wherever the latter occur ?

Done.  It's not strictly necessary to mention them because we just
like <video> and <audio> as examples, but it's better to be explicit,
so I've added them.

> the spec should probably mention that it applies to today's state of the
> world, as described by it and it's normative references, at time of
> publication. Thus if someone e.g. invents some new javascript/DOM gizmo
> that's a possible avenue for exploitation, and it gets implemented in some
> web user agents (new stuff is apparently getting added and "shipped" all the
> time it seems), the CSP spec, UAs' CSP implementations, and existing web
> application CSP policies may not account for it.

Correct.

> This raises the question of "does the spec accurately and completely
> specify, as of Last Call exit time, all the various facets of
> HTML/DOM/javascript that are applicable for all the directive enforcement
> algorithms?" -- it may be good to get some appropriate experts (who haven't
> yet been directly involved in CSP) to do a fine-grained double-check (during
> Last Call say) just to make sure (?).

Agreed.  Hopefully some of the folks who review the spec during Last
Call will be able to help us out in this way.

> All the directive sections effectively contain, after their description of
> the ABNF, what I'll term "directive enforcement algorithms", but these
> aren't denoted as such. however "3.2.11 report-uri" includes the overall
> statement "the user agent must use an algorithm equivalent to the
> following:"   Should the other directive sections include such a statement
> (I could go either way). if not, perhaps it is unnecessary in 3.2.11 ?

I'm not sure it matters that much.  For report-uri, added that text
because there are a number of steps involved.  In other cases, the
requirements fit into one step, so there didn't seem to be a reason to
introduce it as an "algorithm".

> within "2. Conformance"..
>
> given that we have the one directive, "sandbox", which is marked as
> "(optional)"...
>
>  3.2.10 sandbox (Optional)
>
> ..should the "conformant user agent" definition (perhaps that phrase should
> be in <dfn> element?) thus become..
>
>  A conformant user agent is one that implements all the requirements
>  listed in this specification that are applicable to user-agents, and
>  MAY implement those marked as "(Optional)".

Done.  I actually used this formulation, which amounts to the same thing:

      <p>A <dfn>conformant user agent</dfn> MUST implement all the
      requirements listed in this specification that are applicable to
      user-agents, and MAY implement those marked as "(Optional)".

> Plus, such a change might have implications for this sentence from 3.2.10
> Sandbox...
>
>  The sandbox directive is optional. If the user agent does not
>  support the sandbox attribute, the user agent must ignore every
>  sandbox directive.
>
> ..i.e. perhaps it could be simplified (haven't really thought about it
> though).

Yeah, the second part of that sentence doesn't really mean anything.
I've shorted it to just say "The sandbox directive is optional."

> Also, the first sentence is..
>
>  As well as sections marked as non-normative, all authoring guidelines,
>  diagrams, examples, and notes in this specification are non-normative.
>
> ..however, there's four occurrences of "Note:" in various places within
> overall section 3 that are arguably normative, so perhaps "note" should be
> dropped from the above sentence.

That's actually part of the respec boilerplate.  I've changed the one
note that I found that was actually normative to no longer be a note.

> within "2.1 Terminology"..
>
> This section, in its five opening paragraphs, defines key concepts (e.g.,
> application of security polices to protected resources), not simply
> terminology. Perhaps it should be renamed "Key Concepts and Terminology", or
> split into two subsections.

I've renamed it to "Key Concepts and Terminology".  I'm sure there's
much value in splitting it up into separate sections.

> This para seems somewhat inaccurate/confusing given terms and notions that
> are used elsewhere in the spec..
>
>   "A server transmits its security policy for a particular resource as a
> collection of directives, such as default-src 'self', each of which controls
> a specific set of privileges for a document rendered by the user agent."
>
>  a suggested clarification..
>
>  "A server transmits its security policy for a particular protected resource
> as a collection of directives, such as default-src 'self', each of which
> controls a specific set of privileges for that protected resource as
> instantiated by the user agent."

Done.

> this section also states..
>
>  "The Augmented Backus-Naur Form (ABNF) notation used in this document
>   is specified in RFC 5234. [ABNF]"
>
> However, header field ABNF uses "1#policy". The "#rule" is an ABNF extension
> defined in RFC2616 HTTP (and in httpbis) and thus should be appropriately
> normatively cited.

Done.

> within "3.1 Policy Delivery"
>
> This section says for both Content-Security-Policy (CSP) and the
> Content-Security-Policy-Report-Only (CSPRO) Header Fields that..
>
>  "A server may send more than one HTTP header field named ... with a given
> resource representation."
>
> However, later in section "3.5 Implementation Considerations" the following
> is stated..
>
>  "administrators ought to remember that the user agent will enforce only the
> first policy statement it receives. To enforce multiple policies, the
> administrator must combine the policy into a single header. "

Oops.  This is left over from when servers couldn't send more than one
header.  I've removed this sentence.

> Plus the ABNF for the header fields is..
>
>  1#policy
>
> ..which indicates that a header field can have a comma-separated list of
> policies..
>
>  Content-Security-Policy: policy, policy, ...
>
> ..where..
>
>  policy            = [ directive *( ";" [ directive ] ) ]
>
> Which all seems to imply that for the aforementioned policy combination (of
> multiple occurrences of policies and/or CSP header fields), that they /can/
> be simply combined by forming a single header field with the polices comma
> separated. However clarifying how that might work in the context of
> enforcement of "the first policy statement it receives" needs to be figured
> out.

We now enforce them all.

> Also, the term "policy statement" occurs only once in the spec in that
> sentence above and thus it isn't clear whether a "policy statement" is the
> value of a "policy" production, or whether it is the CSP/CSPRO header field
> value.

That sentence is gone now, so we don't need to worry too much about it.

> The above ought to be sorted out and appropriate language added to section
> "3.1 Policy", and possibly "3.5 Implementation Considerations", as well as
> appropriate cross-references between the two sections.
>
> Additionally, the final paragraph of section 3.1.2 beginning "A server may
> monitor one policy while enforcing another policy." should arguably be a
> subsection on it's own, i.e. as section 3.1.3.

I've actually just moved it to the Processing Model section, which is
a more sensible place for it.

> The spec implies that the CSP/CSPRO header fields are not
> mutually-exclusive, but doesn't explicitly state that.

That's what that paragraph is intended to state explicitly.  Should we
try to make it clearer?

> within "3.1.3.2 Source List"...
>
> in <dfn>source expression</dfn> it says...
>
>  the set of URIs which are in the same origin as the protected resource
>
> for clarity, "origin" here should be a link back to dfn>origin</dfn> in
> terminology

Done.

> in the #matches-a-source-expression algorithm, it declares..
>
>  Let scheme, host, and port be the scheme, host, and port of the
>  URI, respectively
>
> ..where the only contextually declared difference between the two different
> sets of the scheme, host, and port terms being their display style (italics
> vs plain). Thus I found it somewhat difficult to parse the non-trivial
> #matches-a-source-expression algorithm text.
>
> I suggest renaming the first set of scheme, host, and port to something like
> "uri-scheme, uri-host, and uri-port", and retaining their italicized style
> treatment.

Done.

> within "3.2.2 script-src"..
>
> WRT operator eval, function eval, function Function, setTimeout function,
> setInterval function -- they should probably explicitly note they are
> javascript constructs and reference the appropriate ecmascript spec

Done.  setTimeout and setInterval are actually from HTML5, but I've
referenced them appropriately.

> given the language the directive enforcement algorithm employs for
> 'unsafe-inline' and 'unsafe-eval', i.e. "If <one of them> is not in allowed
> script sources..", then given a directive of say..
>
>  script-src example.com 'unsafe-inline'
>
> ..then the 'unsafe-inline' effectively obviates inline script protection,
> _but_ eval will be blocked for all sources other than from example.com.

Well, eval will be blocked from example.com too.  It's just that you
can load scripts from example.com, not that those scripts can use
eval.

> It might be worth it to explain such nuances more explicitly. Perhaps could
> expand example 3 (which needs more full explanation anyway wrt the 'unsafe-'
> keyword-sources), maybe add an example 4 using script-src and 'unsafe-', and
> reference them from the directive sections that mention the 'unsafe-'
> keyword-sources.

I've added more explanation to Example 3, but I haven't added another
example.  I'd rather not encourage folks to use 'unsafe-inline'
because that defeats the XSS protections afforded by CSP.

> within "3.2.3 object-src"..
>
> where "nested browsing context" is mentioned ought to reference:
> <http://dev.w3.org/html5/spec/single-page.html#nested-browsing-contexts>
> ("nested browsing context" is also mentioned elsewhere so a centralized defn
> might be appropriate)

Done.

> In the fourth para where it says..
>
>  "It is not required that the consumer of the element's data be a plugin
>   in order for the object-src directive to be enforced. Data for any object,
>   embed, or applet element must match the allowed object sources in order
>   to be fetched. This is true even when the element data is semantically
>   equivalent to content which would be restricted by one of the other
>   directives, such as an object element with a text/html MIME type."
>
> It isn't clear to me what the implication of the final sentence is -- why
> does it matter if the content "would be" restricted by other directives ?
>  Does it mean I don't need to use the object-src directive in some cases, or
> that i had better use all applicable directives, or something else ?

I've added the term otherwise: "equivalent to content which would
otherwise be restricted"

The <object> tag is very general.  It can display all kinds of things,
including HTML, images, etc.  This paragraph is just meant to clarify
that it's the object-src directive that applies in these cases.

> within "3.2.6 media-src"..
>
> should <source> and <track> html5 elements be mentioned here too?

Done.

> within 3.2.10 sandbox (Optional)
>
> "forced sandboxing flag set" is presently a simple link to the definition in
> HTML5 spec of that term. I think also having an explicit citation to [HTML5]
> would be helpful for the reader here, perhaps saying something like
> "Sandboxing is defined in [HTML5], and link to
> <http://dev.w3.org/html5/spec/single-page.html#sandboxing>"

I've added a citation to HTML5, but without any accompanying words.

> Also, it seems that HTML5 doesn't actually define any explicit profiles of
> required flags in a "forced sandboxing flag set", saying only near the end
> of step 20 of "5.6.1 Navigating across documents" that
>
>   "A resource by default has no flags set in its forced sandboxing flag
>    set, but other specifications can define that certain flags are set."
>
> So is there an explicit profile of required flags in a "forced sandboxing
> flag set" that CSP can reference and user agents can implement?  Otherwise
> it seems that it will be difficult to be able to implement sandbox
> enforcement in an interoperable fashion.  Or am I misunderstanding this?

There aren't any profiles.  There's just the set of sandbox flags
defined in http://www.w3.org/TR/html5/the-iframe-element.html#attr-iframe-sandbox

> within "3.2.11 report-uri"..
>
> Is a "dictionary" a defined JSON concept?

You're right.  The correct term is "object".  I've added a citation to
the JSON RFC.

> the manner in which the term is
> used in this section seems to imply this. if so the concept should be
> referenced wherever it's defined.

Done.

> I find this confusing (until reading to the Note at the very end)..
>
>  "Fetch the report URI from origin of the protected resource, with the
>   synchronous flag not set, using HTTP method POST, with a Content-Type
>   header field of application/json with an entity body consisting of the
>   violation report. The user agent must not follow redirects when fetching
>   this resource. (Note: The user agent ignores the fetched resource.)"
>
> In that we're trying to simply say "send the report to the address given by
> the web origin of the protected resource, using HTTP POST method" (yes?)

Correct.

> i.e. the term "Fetch" threw me off, and here it isn't linked to the HTML5
> "fetching" algorithm.  If in W3C parlance "fetch" is the right thing to use
> even here, perhaps parenthetically explain that this will result in
> "sending" the report.

I've added a link to the HTML5 spec.  It's just meant to invoke the
HTML5 fetch algorithm to get the HTTP request set up properly (e.g.,
with the right referrer).

> unfortunately, it seems the HTML5 spec doesn't, other than form submission
> (which is defined it seems such that it's pretty strictly tied to the <form>
> element processing), define an algorithm for "sending" arbitrary data, and
> thus we have the above tortured use of "fetch" to mean "send"  (I suppose
> this is a comment on an arguably missing piece in the HTML5 spec?)

The fetch algorithm is pretty widely used in HTML5 and in other specs.
 Maybe it would help to move the bit about ignoring the response
earlier in the paragraph so that the reader's expectations are set
more clearly?

> within "3.4 Security Considerations"..
>
> The privacy leakage aspects of the violation reports (c.f. the list
> discussion) should be noted.

I've added the following text:

      <section>
        <h3>Violation Reports</h3>

        <p>The violation reporting mechanism in this document has been
        designed to mitigate the risk that a malicious web site could use
        violation reports to probe the behavior of other servers. For example,
        consider a malicious web site that white lists
<code>https://example.com</code>
        as a source of images. If the malicious site attempts to load
        <code>https://example.com/login</code> as an image, and the
        <code>example.com</code> server redirects to an identity provider (e.g.,
        <code>idenityprovider.example.net</code>), CSP will block the request.
        If violation reports contained the full blocked URL, the violation
        report might contain sensitive information contained in the
redirected URI,
        such as session identifiers or purported identities. For this
reason, the
        user agent includes only the origin of the blocked URI.</p>
      </section>

> Should provide (informative?) reference(s) for:
>
> resource representation

Done.

> content injection vulnerabilities

This is used only in the introduction.  Did you have a reference in mind?

> cross-site scripting (XSS)
>
> bookmarklets

These seem valuable to reference, but I'm not sure there are great
references for them other than what you can find easily with a search
engine.

> plugins  -- http://dev.w3.org/html5/spec/single-page.html#plugin

Dine.

> Add to normative references:
>
> CSSOM

Done.

> DOM ?  ("DOM APIs" are mentioned once in "3.2.9 connect-src")

I've reworded this reference.  It's not intended to refer to the DOM
specs (e.g., domcore).

> ecmascript  -- due to normatively referring to eval et al

Done.

> Extensible Stylesheet Language Transformations (XSLT)

Done.

> ecmascript

Done.

> HTTP  -- RFC2616 mentioned in many places, and in ABNF in 3.2.10 sandbox

Done.

> RFC3986 (URIs) -- is ref'd in ABNF in 3.2.11 report-uri

It's referenced as [URI].

> XML (e.g., for where the spec mentions "XML document")

Done.

> editorial items:
> ---------------
>
> overall..
>
> the HTTP spec should be formally referenced. a convenience ref for the
> often-occuring "400 HTTP response" would be nice
> a direct link for 400 response is
> <https://tools.ietf.org/html/rfc2616#section-10.4.1>

Done.

> the header fields defined in this spec are formally "HTTP response header
> fields" and should be designated as such (at least in the first occurrences
> in section 3.1)

Previously when I did something like that, Julian complained that
there was no such thing as a "response" header field.  There were just
header fields.

> eg "The server delivers the policy to the user agent via an HTTP response
> header" should be "The server delivers the policy to the user agent via an
> HTTP response header field."
>
>
> in terminology, some referenced specs are referred to as "specification",
> others are referred to as "standard" -- should pick one for consistency (i
> vote for "specification").

Done.

> the term "document" is used in a few places where it arguably should be
> "resource", "resource representation", or "protected resource"
>  - an occurrence is in "2.1 terminology" and is already corrected above
>  - two occurrences in "3.3.2 Sample Violation Report"
>  - an occurrence in "3.4.1 Cascading Style Sheet (CSS) Parsing"

Fixed.

> within "1. Introduction"..
>
> "the server need to supply" should be "the server needs to supply"

Done.

> within "2.1 Terminology"..
>
> [Origin] should point to RFC6454 :)  and be added to the Normative
> References list.

Fixed.

> [WEBSOCKET], [EVENTSOURCE] should be added to the Normative References list.

Done.

> expand "CSS" acronym

Done.

> within #matches-a-source-expression..
>
> "If the source expression a single U+002A ASTERISK"  should be "If the
> source expression consists of a single U+002A ASTERISK" ?

Fixed.

> "if the URI has the same scheme, host, and port as the resource's URI"
> should be "if the URI has the same scheme, host, and port as the protected
> resource's URI" ?

Yes, fixed.

> within "3.1.4 Processing Model"..
>
> "Generally speaking, enforcing a directive prevent"  should be "Generally
> speaking, enforcing a directive prevents" ?

Fixed.

> "enforcing the policy will break the web application"  could be "enforcing
> the policy will cause the web application to malfunction"

Fixed.

> "Whenever a user agent runs a worker"
> Should probably have a formal reference to <http://www.w3.org/TR/workers/>
> spec.

Fixed.

> within "3.2 Directives"..
>
> "In order to protect against Cross-site Scripting (XSS), authors should
> include"    ..should more clearly be..
>
> "In order to protect against Cross-site Scripting (XSS), web application
> authors should include in their CSP policies:"

Fixed.

> within "3.2.2 script-src"..
>
> reference for XSLT and "stylesheets" ?

Fixed.

> within "3.2.4 style-src"..
>
> CSSOM ?  expansion & reference?

Fixed.

> within "3.2.7 frame-src"..
>
> html5 has a specific subsection entitled "Navigating nested browsing
> contexts in the DOM", perhaps should reference?

I've linked to <http://www.w3.org/TR/html5/history.html#navigate>,
which is slightly more specific.

> within "3.2.11 report-uri"..
>
> "Prepare a dictionary violation dictionary" should be just "Prepare a
> violation dictionary" (even though the first occurance of dictionary is
> plain text and "violation dictionary" is italics)

This has changed a bit now that we're more precise about referencing
JSON.  I've made changed violation object to violation-object, which
should also help.

> Nits:
> -----
>
>
> section "3.1.3 Syntax" should perhaps be "3.1.3 Syntax and Algorithms"

Done.

> "parse a CSP policy" should be a section (so it shows up in the TOC)

Done.

> "parse a source list" should be a section (ditto)

Done.

> "URI matches a source expression" should be a section (ditto)
>
> "URI matches a source list" should be a section (ditto)

I put these two into one section called Matching.

> ---
> end

Thanks for all your comments!

Adam
Received on Wednesday, 27 June 2012 00:43:36 UTC