Re: WGLC: draft-ietf-appsawg-http-forwarded-02.txt - section 5.1

From: Willy Tarreau <w@1wt.eu>
Date: Mon, 14 May 2012 23:48:40 +0200
To: John Sullivan <jsullivan@velocix.com>
Cc: Andreas Petersson <andreas@sbin.se>, Amos Jeffries <squid3@treenet.co.nz>, ietf-http-wg@w3.org
Message-ID: <20120514214840.GM1694@1wt.eu>

On Mon, May 14, 2012 at 02:20:22PM +0100, John Sullivan wrote:
> Willy Tarreau wrote:
> > That's a good point. I must say I've never seen a Location or Referer
> > header being quoted despite their wide use of "[/:]" which are marked
> > as special chars. I'll raise a new issue on this subject.
> 
> That is because those headers are defined as:
> 
>    Referer = absolute-URI / partial-URI
>    Location = URI-reference
>    Content-Location = absolute-URI / partial-URI
> 
> The allowed character set and escaping rules are those that apply
> to those productions from RFC 3986. Those particular characters may
> have special meaning within a URI, depending on their position and
> URI construction, and must be % HEX HEX encoded to avoid said special
> meaning, but the header values do not need to be otherwise quoted or
> escaped beyond the [URI] rules.
> 
> The Forwarded header uses:
> 
>    Forwarded-v = 1#( token "=" ( token / quoted-string ) *( ";" ... ) )
> 
> Compare this with Cache-Control (which is the same but without ";"
> parameters) and Content-Type/Accept (which is the same except with an
> initial value that is a ( type "/" subtype ) media type, including a
> literal "/" character).
> 
> So if the desired value doesn't "fit into" token, it must be a
> quoted-string, where "\" is used as an escape character. Also as
> httpbis puts it:
> 
>    A parameter value that matches the token production can be
>    transmitted as either a token or within a quoted-string.  The quoted
>    and unquoted values are equivalent.
> 
> Quite a different grammar!
> 
> Variations on that theme have sufficient use within RFC 2616/httpbis
> that I think it's good to use something broadly similar. People
> operating in that area ought to have the component parsers readily
> available and understand their use well enough to compose them into
> a total parser for the grammar defined here.

OK, but I mean that I'm used to sending IPv6 X-Forwarded-For headers
without quotes, and I have also seen some of these headers sent with a
port number without quotes. So maybe we're making quotes necessary only
because we allow just token / quoted-string in the value? Maybe it
would make sense to enlarge the allowed character set for values so
that quoted-strings are not needed for most usages?
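
To make the quoting question concrete, here is a minimal Python sketch,
assuming the RFC 2616 definition of token (any CHAR except CTLs and
separators). The helper names is_token and quote_if_needed are
hypothetical and only serve the illustration. It shows that a plain IPv4
address fits into token, while an IPv6 address or an address with a port
does not, because ":" , "[" and "]" are separators.

```python
import re

# token per RFC 2616: one or more US-ASCII chars, excluding control
# characters and the separators ( ) < > @ , ; : \ " / [ ] ? = { } SP HT.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_token(value: str) -> bool:
    """Return True if the whole value matches the token production."""
    return TOKEN_RE.fullmatch(value) is not None


def quote_if_needed(value: str) -> str:
    """Send the value as a token when possible, else as a quoted-string."""
    if is_token(value):
        return value
    # Inside a quoted-string, "\" is the escape character.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


for candidate in ("192.0.2.1", "192.0.2.1:8080", "[2001:db8::1]"):
    print(candidate, "->", quote_if_needed(candidate))
```

Under the current grammar, only the first of the three candidates can be
sent without quotes.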

> It could be argued that by limiting values to quoted-string alone any
> required value can be represented and one cuts off one avenue for
> confusion/incompatibility, but I think defining a new base character
> set should be done only as a last resort.

OK in theory, but in practice I'm fairly sure we'll see IPv6 addresses
sent unquoted, because a number of implementations will not have noticed
that quotes became mandatory. That's why widening a character set to
better fit what it is supposed to represent sometimes makes a lot of
sense. This can even be done by slightly extending the grammar:

   Forwarded-v = 1#( token "=" ( ipv4 / ipv6 / token / quoted-string ) *( ";" ... ) )
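
As a rough illustration of how a receiver could handle a single value
under such an extended grammar, here is a Python sketch. It rests on
assumptions that are not settled above: the ipv4 and ipv6 productions
are taken to be bare address literals as accepted by Python's ipaddress
module (no brackets, no port), and classify_value is a hypothetical
helper name. Splitting the header on "," and ";" is left out.

```python
import ipaddress
import re

# token per RFC 2616 (no control characters, no separators).
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
# quoted-string: double quotes around text, with "\" escaping one char.
QUOTED_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')


def classify_value(raw: str) -> tuple[str, str]:
    """Return (kind, decoded value) for one parameter value."""
    # Try the address alternatives first (assumed: bare literals only).
    try:
        addr = ipaddress.ip_address(raw)
        return ("ipv4" if addr.version == 4 else "ipv6", str(addr))
    except ValueError:
        pass
    if TOKEN_RE.fullmatch(raw):
        return ("token", raw)
    match = QUOTED_RE.fullmatch(raw)
    if match:
        # Remove the backslash from each quoted-pair.
        return ("quoted-string", re.sub(r"\\(.)", r"\1", match.group(1)))
    raise ValueError(f"value does not match the grammar: {raw!r}")


for raw in ("192.0.2.1", "2001:db8::1", "unknown", '"2001:db8::1"'):
    print(raw, "->", classify_value(raw))
```

A quoted IPv6 address is still accepted through the quoted-string
branch, so senders that already quote would keep working.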

Regards,
Willy
Received on Monday, 14 May 2012 21:50:39 GMT
