Re: allowable characters in token as used in parameter ABNF

Speaking personally --

I'm torn here. On one hand, I'd very much like to see best practice promoted, because the wild-west state of HTTP header parsing is one of the things I really dislike, and I suspect it causes a lot of problems.

OTOH, we don't have any implementers stepping up and saying that they're eager to do this, and in that situation it may be too easy to specify the wrong thing.

AIUI the most liberal form of xtoken would be 1*VCHAR without DQUOTE, "," or ";". Correct?
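
To make that concrete, here is a rough sketch in Python of what such a check might look like. It is illustrative only and not taken from either draft: the names XTOKEN_RE and is_xtoken are made up, and it assumes VCHAR is %x21-7E as in RFC 5234.

```python
import re

# Assumption: VCHAR is %x21-7E (RFC 5234). The character class below is
# that range minus DQUOTE (%x22), "," (%x2C) and ";" (%x3B).
# XTOKEN_RE and is_xtoken are hypothetical names, not from any draft.
XTOKEN_RE = re.compile(r'[\x21\x23-\x2B\x2D-\x3A\x3C-\x7E]+')


def is_xtoken(value: str) -> bool:
    """Return True if value is 1*VCHAR with no DQUOTE, "," or ";"."""
    return XTOKEN_RE.fullmatch(value) is not None
```

Under that reading, text/html would be acceptable unquoted, while anything containing whitespace, a comma, a semicolon or a double quote would still need to be a quoted-string. Note that it also lets through "=", "<", ">" and friends, which may or may not be what we want.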

If we can get agreement on that, I think we could document it as the construct for link-extension in the link draft, and perhaps recommend it in section 4 of Julian's draft when talking about how to accommodate extensibility.
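
For illustration, a generic parser for a single link-extension of the form token [ "=" ( xtoken | quoted-string ) ] could be sketched as below. This is only a sketch under stated assumptions, not what either draft specifies: it takes xtoken to be 1*VCHAR minus DQUOTE, "," and ";", uses the RFC 2616 token character set, ignores LWS around "=", and does not police control characters inside quoted-string. The names (LINK_EXT_RE, parse_link_extension) are made up.

```python
import re

# RFC 2616 token: any CHAR except CTLs and separators.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# Assumed xtoken: VCHAR (%x21-7E) minus DQUOTE, "," and ";".
XTOKEN = r'[\x21\x23-\x2B\x2D-\x3A\x3C-\x7E]+'
# Simplified quoted-string: backslash escapes allowed, CTLs not checked.
QUOTED = r'"(?:[^"\\]|\\.)*"'

# Hypothetical pattern for: token [ "=" ( xtoken | quoted-string ) ]
# No LWS handling; a real parser would have to allow for it.
LINK_EXT_RE = re.compile(
    rf'(?P<name>{TOKEN})(?:=(?P<value>{QUOTED}|{XTOKEN}))?'
)


def parse_link_extension(text):
    """Return (name, value) for one link-extension, or None if it does not match.

    value is None when there is no "=" part; quoted values are unquoted.
    """
    m = LINK_EXT_RE.fullmatch(text)
    if m is None:
        return None
    name, value = m.group("name"), m.group("value")
    if value is not None and value.startswith('"'):
        # Strip the surrounding quotes and undo quoted-pair escapes.
        value = re.sub(r'\\(.)', r'\1', value[1:-1])
    return name, value
```

With this, type=text/html and type="text/html" both come out as the same (name, value) pair, so the predefined parameters would not need special-casing.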

Beyond that, we have a bit more time to figure out if it's useful in httpbis as well.

Thoughts?



On 06/02/2010, at 2:40 AM, Julian Reschke wrote:

> Hi,
> 
> this came up while discussing Mark's Link Header draft.
> 
> For link extensions, it currently uses
> 
>  link-extension    = token [ "=" ( token | quoted-string ) ]
> 
> (<http://greenbytes.de/tech/webdav/draft-nottingham-http-link-header-07.html#rfc.section.5>)
> 
> This is consistent with HTTP/1.1's use of parameters:
> 
>  parameter               = attribute "=" value
>  attribute               = token
>  value                   = token | quoted-string
> 
> (<http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.3.6>)
> 
> (Reminder: this is the old RFC2616 ABNF style, allowing LWS between tokens and words etc)
> 
> However, for the (non-extended) parameter "type", the draft says:
> 
>  "type" "=" type-name "/" subtype-name
> 
> ...so it uses a syntax that wouldn't be allowed for link-extensions ("/" is not allowed in tokens).
> 
> Testing with the two UAs that do something with the Link header (Opera and FF) shows that they happily accept both quoted and unquoted type parameters.
> 
> So minimally,
> 
>  "type" "=" DQUOTE type-name "/" subtype-name DQUOTE
> 
> should be allowed as well.
> 
> However, special-casing the syntax for predefined parameters feels lame -- it would be good if you could use a generic parser to get all components.
> 
> So one way out of this would be to require that *all* parameters follow the pattern
> 
>  parameter               = attribute "=" value
>  attribute               = token
>  value                   = token | quoted-string
> 
> From a purity point of view, that probably would be best. However, we have evidence (see above) that some implementations do not require quoted-string for certain characters that *are* forbidden in token.
> 
> Of course this could be addressed by saying "be lenient in what you accept", and be done with it.
> 
> Another approach that we'd like to discuss is to widen the set of characters that are allowed in unquoted parameter values, such as
> 
>  parameter               = attribute "=" value
>  attribute               = token
>  value                   = xtoken | quoted-string
> 
> and make xtoken extend token, also allowing certain harmless characters, such as "/".
> 
> If we do this, we'd really like to do this consistently in both draft-nottingham-http-link-header and draft-reschke-rfc2231-in-http, and potentially even define it in HTTPbis (maybe just as a recommended syntax component for new header fields).
> 
> Feedback appreciated,
> 
> Julian
> 
> 


--
Mark Nottingham     http://www.mnot.net/

Received on Tuesday, 9 February 2010 12:37:10 UTC