- From: Willy Tarreau <w@1wt.eu>
- Date: Mon, 14 May 2012 23:48:40 +0200
- To: John Sullivan <jsullivan@velocix.com>
- Cc: Andreas Petersson <andreas@sbin.se>, Amos Jeffries <squid3@treenet.co.nz>, ietf-http-wg@w3.org
On Mon, May 14, 2012 at 02:20:22PM +0100, John Sullivan wrote:
> Willy Tarreau wrote:
> > That's a good point. I must say I've never seen a Location or Referer
> > header being quoted despite their wide use of "[/:]" which are marked
> > as special chars. I'll raise a new issue on this subject.
>
> That is because those headers are defined as:
>
>   Referer = absolute-URI / partial-URI
>   Location = URI-reference
>   Content-Location = absolute-URI / partial-URI
>
> The allowed character set and escaping rules are those that apply
> to those productions from RFC 3986. Those particular characters may
> have special meaning within a URI, depending on their position and
> URI construction, and must be % HEX HEX encoded to avoid said special
> meaning, but the header values do not need to be otherwise quoted or
> escaped beyond the [URI] rules.
>
> The Forwarded header uses:
>
>   Forwarded-v = 1#( token "=" ( token / quoted-string ) *( ";" ... ) )
>
> Compare this with Cache-Control (which is the same but without ";"
> parameters) and Content-Type/Accept (which is the same except with an
> initial value that is a ( type "/" subtype ) media type, including a
> literal "/" character.
>
> So if the desired value doesn't "fit into" token, it must be a
> quoted-string, where "\" is used as an escape character. Also as
> httpbis puts it:
>
>   A parameter value that matches the token production can be
>   transmitted as either a token or within a quoted-string. The quoted
>   and unquoted values are equivalent.
>
> Quite a different grammar!
>
> Variations on that theme have sufficient use within RFC 2616/httpbis
> that I think it's good to use something broadly similar. People
> operating in that area ought to have the component parsers readily
> available and understand their use well enough to compose them into
> a total parser for the grammar defined here.

OK but I mean, I've been used to send IPv6 X-Forwarded-For headers without
quotes and have seen some of these headers sent with a port number without
quotes either. So maybe we're causing quotes to become necessary only
because we just allow token / quoted string in the value then? Maybe it
would make sense to enlarge the allowed character set for values in order
to avoid making quoted-strings necessary for most usages?

> It could be argued that by limiting values to quoted-string alone any
> required value can be represented and one cuts off one avenue for
> confusion/incompatibility, but I think defining a new base character
> set should be done only as a last resort.

OK in theory but in practice I'm fairly sure we'll see IPv6 addresses sent
unquoted because a number of implementations will not have noticed they
became mandatory. That's why sometimes widening a character set to better
fit what it is supposed to represent makes a lot of sense. This can even
be done by slightly extending the grammar:

  Forwarded-v = 1#( token "=" ( ipv4 / ipv6 / token / quoted-string ) *( ";" ... ) )

Regards,
Willy
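
To illustrate the token / quoted-string point discussed above, here is a minimal Python sketch of how a sender might decide whether a value needs quoting. The character class is the HTTP token set (everything except CTLs and the RFC 2616 separators). The helper names `is_token`, `quote` and `encode_param`, the parameter name `for`, and the sample addresses are assumptions made for illustration only; they are not part of the grammar quoted in this thread.

```python
import re

# Characters permitted in an HTTP "token": printable US-ASCII minus the
# separators ( ) < > @ , ; : \ " / [ ] ? = { } and minus space/tab/CTLs.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_token(value: str) -> bool:
    """Return True if the whole value matches the token production."""
    return TOKEN_RE.fullmatch(value) is not None


def quote(value: str) -> str:
    """Wrap value as a quoted-string, escaping backslash and double quote."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def encode_param(name: str, value: str) -> str:
    """Emit name=value, quoting the value only when it is not a token."""
    return name + "=" + (value if is_token(value) else quote(value))


if __name__ == "__main__":
    # Hypothetical sample values: an IPv4 address fits into token, while
    # anything containing ":" or "[" "]" (IPv6, or a port suffix) does not.
    for sample in ("192.0.2.43", "192.0.2.43:8080", "[2001:db8::1]"):
        print(encode_param("for", sample))
```

Under the grammar as quoted, only the plain IPv4 address goes out bare; the IPv6 address and the address with a port both have to be sent as quoted-strings, which is the case a widened character set (or explicit ipv4 / ipv6 alternatives) would avoid.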
Received on Monday, 14 May 2012 21:50:39 UTC