- From: Karl Dubost <karl@la-grange.net>
- Date: Sun, 3 Mar 2013 14:03:56 -0500
- To: HTTP Working Group <ietf-http-wg@w3.org>
Hi,
This is a bit long, but it came up because we were trying to fix a bug in a Python library concerning HTTP headers and their production rules.
request.add_header('foo', 'bar')
→ "foo:bar"
request.add_header(' foo', 'bar')
→ " foo:bar"
request.add_header('foo ', 'bar')
→ "foo :bar"
What I gathered from the spec:
In 3.2. Header Fields,
http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-22#section-3.2
the header production rules are defined as:
header-field = field-name ":" OWS field-value BWS
field-name = token
field-value = *( field-content / obs-fold )
field-content = *( HTAB / SP / VCHAR / obs-text )
obs-fold = CRLF ( SP / HTAB )
; obsolete line folding
; see Section 3.2.4
The field-name token labels the corresponding field-value as having
the semantics defined by that header field.
So far so good, but we do not yet know what the production rules for "field-name = token" are. They might come later. Let's read a bit more.
In 3.2.3, Whitespace, there are production rules for OWS, BWS, and RWS:
http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-22#section-3.2.3
This defines at least the rules for
header-field = field-name ":" OWS field-value BWS
which basically says:
--------------------------------------------------
OK "Foo: bar" (1 space or more)
OK "Foo: bar" (1 tab or more)
OK "Foo:bar" (no space)
--------------------------------------------------
AVOID "Foo: bar " (1 trailing space or more)
AVOID "Foo: bar " (1 trailing tab or more)
--------------------------------------------------
AVOID means:
* senders SHOULD NOT generate it in messages.
* recipients MUST accept such bad optional whitespace and remove it
before interpreting the field value or forwarding the message
downstream.
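The OWS/BWS handling above can be sketched in Python roughly as follows. This is only an illustration, assuming the header line has already been read without its CRLF and contains no obs-fold; `split_header_line` is a hypothetical name, not a function from any existing library.

```python
def split_header_line(line):
    """Split a header line at the first colon and trim OWS/BWS around the value.

    Hypothetical helper for illustration only. The field-name is returned
    untouched; whitespace around the name is a separate question.
    """
    name, sep, value = line.partition(":")
    if not sep:
        raise ValueError("no colon in header line")
    # OWS and BWS are both *( SP / HTAB ), so strip only those two characters.
    return name, value.strip(" \t")
```

With this sketch, "Foo:bar", "Foo: bar" and "Foo: bar " all yield the same pair ('Foo', 'bar').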
OK, cool. Let's go on.
In 3.2.4 Field Parsing
http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-22#section-3.2.4
No whitespace is allowed between the header field-name and colon.
So in the production rules, we can't do:
-------------------------------------------------------
BAD "Foo :bar" (1 or more space/tab before the ":")
-------------------------------------------------------
In the past, differences in the handling of such whitespace have led to
security vulnerabilities in request routing and response handling. A
server MUST reject any received request message that contains
whitespace between a header field-name and colon with a response code
of 400 (Bad Request).
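A server-side check for this rule could look something like the sketch below. It assumes one unfolded header line per call; the function names are hypothetical, and a real server would do this inside its own parser.

```python
def has_whitespace_before_colon(line):
    """True if SP or HTAB sits directly before the first colon of the line."""
    name, sep, _ = line.partition(":")
    # Without a colon this is not a header line at all; that is a different error.
    return bool(sep) and name.endswith((" ", "\t"))


def status_for_header_line(line):
    """Return 400 for the case described in 3.2.4, otherwise None (hypothetical)."""
    return 400 if has_whitespace_before_colon(line) else None
```

Here "foo :bar" maps to 400, while "foo:bar" and "foo: bar" map to None.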
OK. This is clear too. Tested on the W3C server:
→ curl -I -H "foo :bar" --trace-ascii - http://www.w3.org/
W3C server sent back
HTTP/1.0 400 Bad request
Though not all servers do that:
→ curl -I -H "foo :bar" --trace-ascii - http://www.ietf.org/
HTTP/1.1 200 OK
There is a rule also for proxies:
A proxy MUST remove any such whitespace from a
response message before forwarding the message downstream.
-------------------------------------------------------
"foo :bar" → "foo:bar"
-------------------------------------------------------
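The proxy rule can be sketched the same way. This is a minimal illustration, assuming the proxy handles one unfolded response header line at a time; `proxy_normalize` is a made-up name.

```python
def proxy_normalize(line):
    """Remove SP/HTAB between the field-name and the first colon (hypothetical)."""
    name, sep, rest = line.partition(":")
    if not sep:
        # No colon: leave the line alone, this sketch does not handle that case.
        return line
    # Only the whitespace right before the colon is removed; the rest is untouched.
    return name.rstrip(" \t") + ":" + rest
```

Note that this sketch leaves any leading whitespace in front of the field-name exactly as it was.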
MY QUESTION (finally) :)
Nothing is said about
-------------------------------------------------------
" foo:bar" (1 or more space/tab before the field-name)
-------------------------------------------------------
In Appendix C, the collected ABNF defines token:
http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-22#appendix-C
The section of the spec saying
field-name = token
with
token = 1*tchar
and tchar as
tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
"^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
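To check a candidate field-name against this grammar, the tchar set can be transcribed into a regular expression. This is a sketch, with DIGIT and ALPHA taken as ASCII 0-9 and A-Z / a-z; `is_token` is a hypothetical helper name.

```python
import re

# token = 1*tchar, with the tchar set copied from the ABNF above.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_token(name):
    """True if the whole string matches token = 1*tchar."""
    return TOKEN_RE.fullmatch(name) is not None
```

With it, is_token('foo') is True, while is_token(' foo') and is_token('foo ') are both False, since SP is not a tchar.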
So the production rules forbid a leading space, but nothing is said about parsing this leading space.
* Should it say something?
* If yes, what?
* If not, why?
--
Karl Dubost
http://www.la-grange.net/karl/
Received on Sunday, 3 March 2013 19:03:58 UTC