Comments on Content Security Policy 1.0 Editor's Draft from =JeffH on 2012-06-21 (public-webappsec@w3.org from June 2012)

From: =JeffH <Jeff.Hodges@KingsMountain.com>
Date: Thu, 21 Jun 2012 15:09:24 -0700
To: W3C Web App Security WG <public-webappsec@w3.org>
Message-ID: <4FE39B94.8010902@KingsMountain.com>
Comments on Content Security Policy 1.0 Editor's Draft
http://dvcs.w3.org/hg/content-security-policy/raw-file/tip/csp-1.0-specification.html


The CSPv1.0 spec has markedly improved over the last year+ set of revisions. 
Hat tip to AdamB and the working group on that.

Although the list of comments below appears somewhat long, they largely concern 
fine-grained aspects of the spec. Overall, I found the spec to be in good shape 
-- it is well organized, readable, essentially complete, etc.

I suggest the below comments be appropriately incorporated/resolved prior to 
going formally into Last Call.

I hope this helps,

=JeffH


Substantive comments:
---------------------


Overall..

Are there any salient differences between HTML5 and HTML4 resources that 
browser implementers will have to code for in terms of implementing CSP policy 
enforcement?  Other than <applet> ?

Also, the <source> and <track> html5 elements have "src=" attrs, but aren't
mentioned in the CSP spec.  should they be mentioned along with <video> and 
<audio> wherever the latter occur ?


the spec should probably mention that it applies to today's state of the world, 
as described by it and it's normative references, at time of publication. Thus 
if someone e.g. invents some new javascript/DOM gizmo that's a possible avenue 
for exploitation, and it gets implemented in some web user agents (new stuff is 
apparently getting added and "shipped" all the time it seems), the CSP spec, 
UAs' CSP implementations, and existing web application CSP policies may not 
account for it.

This raises the question of "does the spec accurately and completely specify, 
as of Last Call exit time, all the various facets of HTML/DOM/javascript that 
are applicable for all the directive enforcement algorithms?" -- it may be good 
to get some appropriate experts (who haven't yet been directly involved in CSP) 
to do a fine-grained double-check (during Last Call say) just to make sure (?).

All the directive sections effectively contain, after their description of the 
ABNF, what I'll term "directive enforcement algorithms", but these aren't 
denoted as such. however "3.2.11 report-uri" includes the overall statement 
"the user agent must use an algorithm equivalent to the following:"   Should 
the other directive sections include such a statement (I could go either way). 
if not, perhaps it is unnecessary in 3.2.11 ?



within "2. Conformance"..

given that we have the one directive, "sandbox", which is marked as "(optional)"...

   3.2.10 sandbox (Optional)

..should the "conformant user agent" definition (perhaps that phrase should be 
in <dfn> element?) thus become..

   A conformant user agent is one that implements all the requirements
   listed in this specification that are applicable to user-agents, and
   MAY implement those marked as "(Optional)".

Plus, such a change might have implications for this sentence from 3.2.10 
Sandbox...

   The sandbox directive is optional. If the user agent does not
   support the sandbox attribute, the user agent must ignore every
   sandbox directive.

..i.e. perhaps it could be simplified (haven't really thought about it though).

Also, the first sentence is..

   As well as sections marked as non-normative, all authoring guidelines,
   diagrams, examples, and notes in this specification are non-normative.

..however, there's four occurrences of "Note:" in various places within overall 
section 3 that are arguably normative, so perhaps "note" should be dropped from 
the above sentence.



within "2.1 Terminology"..

This section, in its five opening paragraphs, defines key concepts (e.g., 
application of security polices to protected resources), not simply 
terminology. Perhaps it should be renamed "Key Concepts and Terminology", or 
split into two subsections.

This para seems somewhat inaccurate/confusing given terms and notions that are 
used elsewhere in the spec..

    "A server transmits its security policy for a particular resource as a 
collection of directives, such as default-src 'self', each of which controls a 
specific set of privileges for a document rendered by the user agent."

   a suggested clarification..

   "A server transmits its security policy for a particular protected resource 
as a collection of directives, such as default-src 'self', each of which 
controls a specific set of privileges for that protected resource as 
instantiated by the user agent."

this section also states..

   "The Augmented Backus-Naur Form (ABNF) notation used in this document
    is specified in RFC 5234. [ABNF]"

However, header field ABNF uses "1#policy". The "#rule" is an ABNF extension 
defined in RFC2616 HTTP (and in httpbis) and thus should be appropriately 
normatively cited.



within "3.1 Policy Delivery"

This section says for both Content-Security-Policy (CSP) and the 
Content-Security-Policy-Report-Only (CSPRO) Header Fields that..

   "A server may send more than one HTTP header field named ... with a given 
resource representation."

However, later in section "3.5 Implementation Considerations" the following is 
stated..

   "administrators ought to remember that the user agent will enforce only the 
first policy statement it receives. To enforce multiple policies, the 
administrator must combine the policy into a single header. "

Plus the ABNF for the header fields is..

   1#policy

..which indicates that a header field can have a comma-separated list of policies..

   Content-Security-Policy: policy, policy, ...

..where..

   policy            = [ directive *( ";" [ directive ] ) ]


Which all seems to imply that for the aforementioned policy combination (of 
multiple occurrences of policies and/or CSP header fields), that they /can/ be 
simply combined by forming a single header field with the polices comma 
separated. However clarifying how that might work in the context of enforcement 
of "the first policy statement it receives" needs to be figured out.

Also, the term "policy statement" occurs only once in the spec in that sentence 
above and thus it isn't clear whether a "policy statement" is the value of a 
"policy" production, or whether it is the CSP/CSPRO header field value.

The above ought to be sorted out and appropriate language added to section "3.1 
Policy", and possibly "3.5 Implementation Considerations", as well as 
appropriate cross-references between the two sections.

Additionally, the final paragraph of section 3.1.2 beginning "A server may 
monitor one policy while enforcing another policy." should arguably be a 
subsection on it's own, i.e. as section 3.1.3.

The spec implies that the CSP/CSPRO header fields are not mutually-exclusive, 
but doesn't explicitly state that.



within "3.1.3.2 Source List"...

in <dfn>source expression</dfn> it says...

   the set of URIs which are in the same origin as the protected resource

for clarity, "origin" here should be a link back to dfn>origin</dfn> in terminology

in the #matches-a-source-expression algorithm, it declares..

   Let scheme, host, and port be the scheme, host, and port of the
   URI, respectively

..where the only contextually declared difference between the two different 
sets of the scheme, host, and port terms being their display style (italics vs 
plain). Thus I found it somewhat difficult to parse the non-trivial 
#matches-a-source-expression algorithm text.

I suggest renaming the first set of scheme, host, and port to something like 
"uri-scheme, uri-host, and uri-port", and retaining their italicized style 
treatment.



within "3.2.2 script-src"..

WRT operator eval, function eval, function Function, setTimeout function, 
setInterval function -- they should probably explicitly note they are 
javascript constructs and reference the appropriate ecmascript spec

given the language the directive enforcement algorithm employs for 
'unsafe-inline' and 'unsafe-eval', i.e. "If <one of them> is not in allowed 
script sources..", then given a directive of say..

   script-src example.com 'unsafe-inline'

..then the 'unsafe-inline' effectively obviates inline script protection, _but_ 
eval will be blocked for all sources other than from example.com.

It might be worth it to explain such nuances more explicitly. Perhaps could 
expand example 3 (which needs more full explanation anyway wrt the 'unsafe-' 
keyword-sources), maybe add an example 4 using script-src and 'unsafe-', and 
reference them from the directive sections that mention the 'unsafe-' 
keyword-sources.




within "3.2.3 object-src"..

where "nested browsing context" is mentioned ought to reference:
<http://dev.w3.org/html5/spec/single-page.html#nested-browsing-contexts>
("nested browsing context" is also mentioned elsewhere so a centralized defn 
might be appropriate)

In the fourth para where it says..

   "It is not required that the consumer of the element's data be a plugin
    in order for the object-src directive to be enforced. Data for any object,
    embed, or applet element must match the allowed object sources in order
    to be fetched. This is true even when the element data is semantically
    equivalent to content which would be restricted by one of the other
    directives, such as an object element with a text/html MIME type."

It isn't clear to me what the implication of the final sentence is -- why does 
it matter if the content "would be" restricted by other directives ?  Does it 
mean I don't need to use the object-src directive in some cases, or that i had 
better use all applicable directives, or something else ?



within "3.2.6 media-src"..

should <source> and <track> html5 elements be mentioned here too?


within 3.2.10 sandbox (Optional)

"forced sandboxing flag set" is presently a simple link to the definition in 
HTML5 spec of that term. I think also having an explicit citation to [HTML5] 
would be helpful for the reader here, perhaps saying something like "Sandboxing 
is defined in [HTML5], and link to 
<http://dev.w3.org/html5/spec/single-page.html#sandboxing>"

Also, it seems that HTML5 doesn't actually define any explicit profiles of 
required flags in a "forced sandboxing flag set", saying only near the end of 
step 20 of "5.6.1 Navigating across documents" that

    "A resource by default has no flags set in its forced sandboxing flag
     set, but other specifications can define that certain flags are set."

So is there an explicit profile of required flags in a "forced sandboxing flag 
set" that CSP can reference and user agents can implement?  Otherwise it seems 
that it will be difficult to be able to implement sandbox enforcement in an 
interoperable fashion.  Or am I misunderstanding this?




within "3.2.11 report-uri"..

Is a "dictionary" a defined JSON concept? the manner in which the term is used 
in this section seems to imply this. if so the concept should be referenced 
wherever it's defined.

I find this confusing (until reading to the Note at the very end)..

   "Fetch the report URI from origin of the protected resource, with the
    synchronous flag not set, using HTTP method POST, with a Content-Type
    header field of application/json with an entity body consisting of the
    violation report. The user agent must not follow redirects when fetching
    this resource. (Note: The user agent ignores the fetched resource.)"

In that we're trying to simply say "send the report to the address given by the 
web origin of the protected resource, using HTTP POST method" (yes?)

i.e. the term "Fetch" threw me off, and here it isn't linked to the HTML5 
"fetching" algorithm.  If in W3C parlance "fetch" is the right thing to use 
even here, perhaps parenthetically explain that this will result in "sending" 
the report.

unfortunately, it seems the HTML5 spec doesn't, other than form submission 
(which is defined it seems such that it's pretty strictly tied to the <form> 
element processing), define an algorithm for "sending" arbitrary data, and thus 
we have the above tortured use of "fetch" to mean "send"  (I suppose this is a 
comment on an arguably missing piece in the HTML5 spec?)



within "3.4 Security Considerations"..

The privacy leakage aspects of the violation reports (c.f. the list discussion) 
should be noted.




Should provide (informative?) reference(s) for:

resource representation

content injection vulnerabilities

cross-site scripting (XSS)

bookmarklets

plugins  -- http://dev.w3.org/html5/spec/single-page.html#plugin




Add to normative references:

CSSOM

DOM ?  ("DOM APIs" are mentioned once in "3.2.9 connect-src")

ecmascript  -- due to normatively referring to eval et al

Extensible Stylesheet Language Transformations (XSLT)

ecmascript

HTTP  -- RFC2616 mentioned in many places, and in ABNF in 3.2.10 sandbox

RFC3986 (URIs) -- is ref'd in ABNF in 3.2.11 report-uri

XML (e.g., for where the spec mentions "XML document")






editorial items:
---------------

overall..

the HTTP spec should be formally referenced. a convenience ref for the 
often-occuring "400 HTTP response" would be nice
a direct link for 400 response is 
<https://tools.ietf.org/html/rfc2616#section-10.4.1>

the header fields defined in this spec are formally "HTTP response header 
fields" and should be designated as such (at least in the first occurrences in 
section 3.1)

eg "The server delivers the policy to the user agent via an HTTP response 
header" should be "The server delivers the policy to the user agent via an HTTP 
response header field."


in terminology, some referenced specs are referred to as "specification", 
others are referred to as "standard" -- should pick one for consistency (i vote 
for "specification").


the term "document" is used in a few places where it arguably should be 
"resource", "resource representation", or "protected resource"
  - an occurrence is in "2.1 terminology" and is already corrected above
  - two occurrences in "3.3.2 Sample Violation Report"
  - an occurrence in "3.4.1 Cascading Style Sheet (CSS) Parsing"


within "1. Introduction"..

"the server need to supply" should be "the server needs to supply"



within "2.1 Terminology"..

[Origin] should point to RFC6454 :)  and be added to the Normative References list.

[WEBSOCKET], [EVENTSOURCE] should be added to the Normative References list.

expand "CSS" acronym




within #matches-a-source-expression..

"If the source expression a single U+002A ASTERISK"  should be "If the source 
expression consists of a single U+002A ASTERISK" ?

"if the URI has the same scheme, host, and port as the resource's URI" should 
be "if the URI has the same scheme, host, and port as the protected resource's 
URI" ?



within "3.1.4 Processing Model"..

"Generally speaking, enforcing a directive prevent"  should be "Generally 
speaking, enforcing a directive prevents" ?


"enforcing the policy will break the web application"  could be "enforcing the 
policy will cause the web application to malfunction"

"Whenever a user agent runs a worker"
Should probably have a formal reference to <http://www.w3.org/TR/workers/> spec.



within "3.2 Directives"..

"In order to protect against Cross-site Scripting (XSS), authors should 
include"    ..should more clearly be..

"In order to protect against Cross-site Scripting (XSS), web application 
authors should include in their CSP policies:"



within "3.2.2 script-src"..

reference for XSLT and "stylesheets" ?




within "3.2.4 style-src"..

CSSOM ?  expansion & reference?




within "3.2.7 frame-src"..

html5 has a specific subsection entitled "Navigating nested browsing contexts 
in the DOM", perhaps should reference?



within "3.2.11 report-uri"..

"Prepare a dictionary violation dictionary" should be just "Prepare a violation 
dictionary" (even though the first occurance of dictionary is plain text and 
"violation dictionary" is italics)






Nits:
-----


section "3.1.3 Syntax" should perhaps be "3.1.3 Syntax and Algorithms"

"parse a CSP policy" should be a section (so it shows up in the TOC)

"parse a source list" should be a section (ditto)

"URI matches a source expression" should be a section (ditto)

"URI matches a source list" should be a section (ditto)


---
end
Received on Thursday, 21 June 2012 22:09:56 UTC