Re: Comments on the HTTP Sec-From Header (draft-abarth-origin) from =JeffH on 2009-07-14 (ietf-http-wg@w3.org from July to September 2009)

From: =JeffH <Jeff.Hodges@KingsMountain.com>
Date: Tue, 14 Jul 2009 16:59:37 -0700
To: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <4A5D1BE9.7020007@KingsMountain.com>
[these comments are intended to be complementary to the ones by Mark Nottingham]


substantive questions/comments:

0. when using term "host" in the spec (e.g. in Sec 2), do you syntactically 
mean <host> from [RFC3986]...

    host          = IP-literal / IPv4address / reg-name

?   (see also comment 6 below)


1. the spec should have a section defining the sec-from header per se 
(presently it's only introduced in context in Sec 5), and its syntax in ABNF 
[RFC5234]. Here's a quick example of the latter (dunno if exactly correct, and 
depends on syntax decisions you make)...


    sec-from       = "sec-from" ":" 1*WSP [ "null" / origin-list ]

    origin-list    = delim-serialized-origin *( 1*WSP delim-serialized-origin )

    delim-serialized-origin = "<" serialized-origin ">"

    serialized-origin       = scheme "://" host [ ":" port ]

                            ; <scheme>, <host>, <port> productions from RFC3986
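
To make the grammar above concrete, here is a rough Python sketch of a parser for the header *value* (the part after "sec-from:"). It is only an illustration under assumptions: the name parse_sec_from is made up, the regular expressions merely approximate the <scheme>, <host>, and <port> productions of RFC3986 (no pct-encoding, no sub-delims), and an empty value is accepted only because the square brackets in the example ABNF make the value optional.

```python
import re

# Loose approximations of the RFC3986 productions (assumption: NOT the full grammar).
_SCHEME = r"[A-Za-z][A-Za-z0-9+.-]*"
_HOST = r"(?:\[[0-9A-Fa-f:.]+\]|[A-Za-z0-9.-]+)"   # bracketed IP-literal, or IPv4/reg-name
_PORT = r"[0-9]*"
_DELIM_ORIGIN = re.compile(rf"<({_SCHEME})://({_HOST})(?::({_PORT}))?>")


def parse_sec_from(value):
    """Parse a sec-from header value per the example ABNF.

    Returns None for "null", otherwise a list of (scheme, host, port)
    tuples, where port is a string or None. Raises ValueError on
    anything that does not match.
    """
    value = value.strip(" \t")
    if value == "":
        return []          # the [ ... ] in the example ABNF makes the value optional
    if value.lower() == "null":
        return None        # ABNF string literals are case-insensitive
    origins = []
    for token in re.split(r"[ \t]+", value):   # 1*WSP between list members
        match = _DELIM_ORIGIN.fullmatch(token)
        if match is None:
            raise ValueError(f"not a delim-serialized-origin: {token!r}")
        origins.append(match.groups())
    return origins
```

E.g. parse_sec_from("<http://example.com> <https://example.org:8443>") yields two tuples, the first with no port.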



2. wrt question on list re how to delimit variable names in text -- if you use 
ABNF, then there is IETF precedent for using <rulename> in text (see [RFC2616] 
for example; plus there may be such precedent w/o explicitly using ABNF in a 
spec).


3. given the ABNF, it may be "easier" to construct a declarative style spec, 
but it wouldn't necessarily rule out algorithmic style (will help make the 
latter more clear imv).


4. in sections 4 & 5, U+006E, U+0075, U+006C, U+006C ("null") is properly not a 
"character sequence" but a "code point sequence", yes? Thus the actual 
serialized octet sequence for such a sequence (string) will differ depending on 
the actual encoding used, e.g. utf-8 vs utf-16, AFAIK. Also, HTTP header fields 
default to using US-ASCII characters unless the header field value is 
explicitly defined to be of type *TEXT. Thus I suggest that the "null" in the 
<sec-from> production above is sufficient and correct (it is US-ASCII encoded 
per RFC5234), and the U+006E, U+0075, U+006C, U+006C code points should be 
removed from the spec.
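
A small Python illustration of the point about code points vs. octets (the variable names are just for this example): the same four code points serialize to different octet sequences depending on the encoding chosen.

```python
# The four code points U+006E, U+0075, U+006C, U+006C, written explicitly.
null_code_points = "\u006e\u0075\u006c\u006c"

as_ascii = null_code_points.encode("ascii")
as_utf8 = null_code_points.encode("utf-8")
as_utf16 = null_code_points.encode("utf-16-be")

print(as_ascii)   # b'null'
print(as_utf8)    # b'null'  (identical to US-ASCII for these code points)
print(as_utf16)   # b'\x00n\x00u\x00l\x00l'  (eight octets, not four)
```

So only once an encoding is fixed (US-ASCII, for an ABNF literal per RFC5234) is the on-the-wire octet sequence determined.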


6. in sections 4.1 & 4.2, the spec references the "IDNA ToUnicode algorithm" 
and the "IDNA ToASCII algorithm" (both of [RFC3490], which needs to be cited, 
too). This is confusing because the proper reference point in RFC3490 is 
Section 4 "Conversion Operations" which contains both of those algorithms, as 
well as specifying steps that need to be taken around invocation of either of 
those algorithms.

Additionally, RFC3490 is only about DNS domain names, and so the results of 
running <host> values matching the <IP-literal> or <IPv4address> productions 
(RFC3986) through it are undefined AFAIK (I asked Paul Hoffman today and that's 
what he said).

Here's step 4 of your sec 4.2 re-written to accommodate the above...

    4.  If the value of the <host> origin tuple component is
        in the form of a DNS domain name, apply the IDNA conversion
        algorithm ([RFC3490] section 4) to it, with both the
        AllowUnassigned and UseSTD3ASCIIRules flags set, employ
        the ToASCII operation, and append the result to /result/.

..though the overall algorithm in draft-abarth-origin may need further tweaking 
wrt running IP addresses thru it.
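
As a rough sketch of the "only run DNS names through IDNA" idea, here is one way it might look in Python. Assumptions, loudly: host_to_ascii is a made-up name; Python's built-in "idna" codec implements the RFC3490 ToASCII operation per label but does not expose the AllowUnassigned or UseSTD3ASCIIRules flags, so this only approximates the step as worded above; and the IP-literal test is just a check for surrounding square brackets.

```python
import ipaddress


def host_to_ascii(host: str) -> str:
    """Return host unchanged if it is an IP address, else its IDNA ToASCII form."""
    # <IP-literal> per RFC3986 is bracketed; leave it alone (assumption: no validation).
    if host.startswith("[") and host.endswith("]"):
        return host
    # <IPv4address>: leave it alone as well.
    try:
        ipaddress.IPv4Address(host)
        return host
    except ValueError:
        pass
    # Otherwise treat it as a DNS domain name and apply ToASCII to each label.
    # NOTE: the stdlib codec offers no AllowUnassigned / UseSTD3ASCIIRules control.
    return host.encode("idna").decode("ascii")
```

E.g. host_to_ascii("bücher.example") gives "xn--bcher-kva.example", while "192.0.2.1" and "[2001:db8::1]" pass through untouched.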


7. Section 5 -- "privacy-sensitive" context is undefined. It is implicitly 
vaguely defined in sec 7. Also, assuming a definition exists, how does some 
given UA "know" whether it is "in" a privacy-sensitive context ?


8. We should define exactly what is meant by "an HTTP redirect from URI" foo 
(sec 5). E.g. given foo is the Request-URI in the Request-Line of an HTTP 
message sent to a host:port, then "HTTP redirect from URI foo" is the returned 
HTTP response message from said host:port containing a 3xx Status Code (all 
terms RFC2616).
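
Expressed as a tiny Python sketch (the Exchange type and is_redirect_from are names invented for this illustration, not anything from the draft), that definition might read:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exchange:
    """One HTTP request/response pair with a given host:port (hypothetical type)."""
    request_uri: str   # Request-URI from the Request-Line
    status_code: int   # Status-Code from the response's Status-Line


def is_redirect_from(exchange: Exchange, uri: str) -> bool:
    """True if the response to a request for `uri` carries a 3xx Status Code."""
    return exchange.request_uri == uri and 300 <= exchange.status_code <= 399
```

Whether every 3xx code (e.g. 304) ought to count is exactly the sort of thing the spec's definition would need to pin down.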


9. In the 2nd Step 2 in Sec 5, where it says "...unless
        this would result in the header containing the origin
        serialization "null". "

Do you mean to say..

                                                ...unless
        this would result in the header containing the origin
        serialization "null" as any component value?

?


10. Until I very carefully re-read the algorithm and its immediately preceding 
paragraph in Sec 6, I didn't realize that the "[MUST NOT / MAY] modify state" 
return values _are not_ intended to be sent over the wire to the client.

Could suggest mods, but want to wait until decision is made on whether to 
re-write spec in non-algorithmic style.


11. in sec 7 "privacy considerations" this claim is made..

   "The Sec-From header also improves on the Referer header by NOT
    leaking intranet host names to external Web sites..."

..but isn't it the case that it's not the "sec-from" header per se that is 
doing this; rather, it's the overall behavior/policy specified by the spec 
that's providing such protection, depending though upon yet-to-be-defined means 
for determining whether any given HTTP request generated by a UA is "in a 
privacy-sensitive context" or is "a privacy-sensitive request" ?


12. sec 8 "sec cons" says in part, "This design prevents an attacker from 
making a supporting user agent appear to be a non-supporting user agent."

But this is only a certain class of attacker, not _any_ attacker (I presume). 
The former being the class that controls some website that issues CSRF requests 
(via whatever means (<form>, .js, etc.)), but otherwise has no means to subvert 
the UA, e.g. turn off sec-from issuance. The latter of course relies on correct 
UA implementation.

Probably worthwhile to have a "threat analysis" (sub)section along with a 
description of CSRF (as suggested below).






Editorial comments:

1. I'd have both normative & informative references, and in the latter include 
a cite for at least..

   Robust Defenses for Cross-Site Request Forgery
   http://www.adambarth.com/papers/2008/barth-jackson-mitchell-b.pdf

..also perhaps cite..

   http://www.cgisecurity.com/csrf-faq.html


2. In addition to Mark's comment of..

   It would be helpful to give more information about the use case and
   applicability of this header in the Introduction.

..I'd add that a clear and detailed description of CSRF itself would be quite 
helpful.


3. "Unicode Serialization of an Origin" is not employed in the spec as yet -- 
if it remains unused but there's a reasonable expectation of needing it in 
future, I'd explain that it is there for reference, e.g. from other specs; 
otherwise remove it.


4. various minor spelling and grammar errors (e.g. missing words), but that can 
be cleaned up at a later stage.


HTH,

=JeffH
Received on Wednesday, 15 July 2009 00:00:14 UTC