Header field-name token and leading spaces

From: Karl Dubost <karl@la-grange.net>
Date: Sun, 3 Mar 2013 14:03:56 -0500
Message-Id: <02D76B3C-BED4-4E6C-BF72-6ED327FF72E8@la-grange.net>
To: HTTP Working Group <ietf-http-wg@w3.org>

This is a bit long, but it came up because we were trying to fix a bug in a Python library concerning HTTP headers and their production rules.

request.add_header('foo', 'bar')
→ "foo:bar"
request.add_header(' foo', 'bar')
→ " foo:bar"
request.add_header('foo ', 'bar')
→ "foo :bar"

What I gathered from the spec:

In 3.2, Header Fields, the header production rules are defined as:

     header-field   = field-name ":" OWS field-value BWS
     field-name     = token
     field-value    = *( field-content / obs-fold )
     field-content  = *( HTAB / SP / VCHAR / obs-text )
     obs-fold       = CRLF ( SP / HTAB )
                    ; obsolete line folding
                    ; see Section 3.2.4

   The field-name token labels the corresponding field-value as having
   the semantics defined by that header field.

So far so good, but we do not yet know what the production rules are for "field-name     = token". They might come later. Let's read a bit more.

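In 3.2.3, Whitespace, there are production rules for OWS, BWS, and RWS:

     OWS            = *( SP / HTAB )
                    ; "optional" whitespace
     RWS            = 1*( SP / HTAB )
                    ; "required" whitespace
     BWS            = OWS
                    ; "bad" whitespace

This defines at least the whitespace rules for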

     header-field   = field-name ":" OWS field-value BWS

which basically says:

OK    "Foo: bar"       (1 space or more)
OK    "Foo:	bar"   (1 tab or more)
OK    "Foo:bar"        (no space)
AVOID "Foo: bar "      (1 trailing space or more)
AVOID "Foo: bar "      (1 trailing tab or more)

AVOID means:

* senders SHOULD NOT generate it in messages.
* recipients MUST accept such bad optional whitespace and remove it
  before interpreting the field value or forwarding the message
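
To make the recipient side concrete, here is a minimal Python sketch of how a parser might apply that rule. The name split_header_field is made up for illustration, and the sketch assumes that OWS and BWS are both zero or more SP / HTAB characters. It does not handle obs-fold and it does not validate the field-name.

```python
# Hypothetical helper for illustration; not taken from any library.
WS = " \t"  # SP / HTAB, assumed to be what OWS and BWS allow


def split_header_field(line):
    """Split 'field-name ":" OWS field-value BWS' into (name, value)."""
    # Split at the first colon only; the field-value may itself contain ":".
    name, sep, rest = line.partition(":")
    if not sep:
        raise ValueError("no colon in header field")
    # Drop the leading OWS and the trailing BWS before the value is used.
    return name, rest.strip(WS)
```

With this, "Foo: bar", "Foo:bar" and "Foo: bar " all yield the same field-value, which is what the MUST on recipients asks for.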

OK, cool. Let's go on.

In 3.2.4 Field Parsing

   No whitespace is allowed between the header field-name and colon.  

So in the production rules, we can't do:

BAD  "Foo :bar"    (1 or more space/tab before the ":")

   In the past, differences in the handling of such whitespace have led to
   security vulnerabilities in request routing and response handling.  A
   server MUST reject any received request message that contains
   whitespace between a header field-name and colon with a response code
   of 400 (Bad Request).
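
As a rough sketch of what that server-side MUST could look like in code (the function name is hypothetical, and this only inspects the character right before the first colon, nothing else):

```python
# Hypothetical check for illustration; not how any particular server does it.
def status_for_request_header_line(line):
    """Return 400 if SP / HTAB sits between field-name and colon, else None."""
    name, sep, _ = line.partition(":")
    # Only whitespace immediately before the colon is considered here.
    if sep and name and name[-1] in " \t":
        return 400
    return None
```

Note that a line such as " foo:bar" passes this check untouched, because the whitespace there is before the field-name and not between the field-name and the colon.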

OK. This is clear too. Tested on the W3C server:

→ curl -I -H "foo :bar" --trace-ascii - http://www.w3.org/

The W3C server sent back:

    HTTP/1.0 400 Bad request

Though not all servers do that:

→ curl -I -H "foo :bar" --trace-ascii - http://www.ietf.org/
HTTP/1.1 200 OK

There is also a rule for proxies:

   A proxy MUST remove any such whitespace from a
   response message before forwarding the message downstream.
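
A minimal sketch of that proxy rule in Python, assuming the proxy works on one header line at a time (the function name is invented for illustration):

```python
# Hypothetical normalizer for illustration; not taken from any proxy.
def normalize_response_header_line(line):
    """Remove SP / HTAB between the field-name and the first colon."""
    name, sep, rest = line.partition(":")
    if not sep:
        # No colon at all: leave the line for other error handling.
        return line
    # Only the right-hand side of the name is trimmed; the value is untouched.
    return name.rstrip(" \t") + sep + rest
```

In other words: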

"foo :bar" → "foo:bar"

MY QUESTION (finally) :) 

Nothing is said about
" foo:bar"   (1 or more space/tab before the field-name)

The section of the spec says

     field-name     = token

and in Appendix C, the ABNF defines token as

   token = 1*tchar

and tchar as

   tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
    "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA

So the production rules forbid a leading space, but nothing is said about parsing this leading space. 
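
The first half of that is easy to check mechanically. Here is a small Python sketch, assuming DIGIT and ALPHA mean the ASCII ranges 0-9, A-Z and a-z; TOKEN_RE and is_token are names I made up for this mail:

```python
import re

# tchar as quoted above, with DIGIT and ALPHA expanded to ASCII ranges
# (an assumption of this sketch). token = 1*tchar, hence the "+".
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_token(name):
    """Return True if name matches token = 1*tchar in its entirety."""
    return TOKEN_RE.fullmatch(name) is not None
```

is_token("foo") is true, while is_token(" foo") and is_token("foo ") are both false. A generator could use such a check to refuse the field-name, but it says nothing about what a recipient should do when it receives one.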

* Should it say something? 
* If yes, what? 
* If not, why?

Karl Dubost
Received on Sunday, 3 March 2013 19:03:58 GMT
