Re: #307 (untangle Cache-Control ABNF)

On 2012-06-14 05:23, Mark Nottingham wrote:
>
> On 13/06/2012, at 8:55 PM, Julian Reschke wrote:
>>>>
>>>>    Cache directives are identified by a token, to be compared case-
>>>>    insensitively, and have an optional argument, that can use both token
>>>>    and quoted-string syntax.  For the directives defined below that
>>>>    define arguments, recipients ought to accept both forms, even if one
>>>>    is documented to be preferred.  For any other directive, recipients
>>>>    MUST accept both forms.
>>>
>>> You've managed to qualify it for those defined below, but not for "any other directive."
>>
>> Indeed... How about:
>>
>>    Cache directives are identified by a token, to be compared case-
>>    insensitively, and have an optional argument, that can use both token
>>    and quoted-string syntax.  For the directives defined below that
>>    define arguments, recipients ought to accept both forms, even if one
>>    is documented to be preferred.  For any directive not defined by this
>>    specification, recipients MUST accept both forms.
>>
>> ? (<http://trac.tools.ietf.org/wg/httpbis/trac/attachment/ticket/307/307.4.diff>)
>
> Still managing to avoid it. The last sentence needs to say something like:
>
> """
> For any other directive that accepts an argument, recipients MUST accept both forms.
> """

I believe not only that all new directives should be parseable with a
generic parser, but also that senders should be able to *rely* on them
being parsed that way; see James' explanation.

>> It means that you add new special cases to something that is already unnecessarily complex. Why would you want to do that?
>
> What special cases?
>
> For example, if I define a new directive that takes a URL as its argument, I might decide to say that it should ONLY use the quoted form, because I know that people will mess it up if they use token. How does that make things more complex? Your generic parser works the same.

It's a special case in libraries that *construct* the header field. To 
conform to the SHOULD requirement, they'd have to understand the new 
directive.
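
To illustrate the sender side, here is a minimal Python sketch (not taken 
from any existing library). The names serialize_directive and quoted_only, 
and the directive name "example-uri", are hypothetical. A generic 
serializer only needs to look at the value; a "quoted form only" rule 
forces it to carry a per-directive table:

```python
import re

# token as in RFC 2616: any CHAR except CTLs or separators
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def quote(value):
    """Wrap value in a quoted-string, escaping backslash and DQUOTE."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def serialize_directive(name, value=None, quoted_only=frozenset()):
    """Serialize one cache directive.

    quoted_only is the hypothetical special-case table: names of
    directives whose definition says "use the quoted form only".
    """
    if value is None:
        return name
    # Generic rule: use token syntax when the value allows it.
    if name.lower() in quoted_only or not TOKEN_RE.fullmatch(value):
        return name + "=" + quote(value)
    return name + "=" + value


# Generic behaviour: no knowledge of individual directives needed.
print(serialize_directive("max-age", "60"))
print(serialize_directive("ext", "a, b"))
# Special case: the library has to know about "example-uri" up front.
print(serialize_directive("example-uri", "abc", quoted_only={"example-uri"}))
```

With an empty quoted_only table the last call would emit the token form, 
which is exactly the case where a generic library cannot conform without 
being updated.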

>>>>> * Isn't the requirement more appropriate in 3.2.3 Cache Control Extensions?
>>>>
>>>> The requirement is about how to parse the header field; it's not specific to extensions. Even if you do not understand a single extension (and do not plan to), you still have to skip over them properly while parsing.
>>>
>>> I see. I understand what you're trying to do, but I think it's too easy to misread it as "all new directives must be defined to allow both forms", which is both unnecessary and not realistic. Some rewording might help, but I question whether it's really necessary to put requirements around this; as you say, reading the BNF should make it clear what a generic parser needs to do.
>>
>> No, that's actually the *intent*. I believe it's both necessary *and* realistic.
>
>
> So, I didn't see what you're trying to do. However, above you say it's not specific to extensions, but as you say later, that's the effect (and intent).

Optimally, it would apply to all directives, but we can't do that 
without making existing code non-compliant (although I'm pretty sure 
that most existing code is non-compliant anyway).

> I think it's going too far; in similar situations we haven't laid down such draconian rules.

I would call that the opposite of draconian; I want to make *more* 
things conforming on the recipient side.

> E.g., we don't place ANY conformance requirements on new headers; it's all advice: <https://svn.tools.ietf.org/svn/wg/httpbis/draft-ietf-httpbis/latest/p2-semantics.html#considerations.for.creating.header.fields>

Optimally we would, but we don't have a mandate for that.

> Likewise for auth schemes: <https://svn.tools.ietf.org/svn/wg/httpbis/draft-ietf-httpbis/latest/p7-auth.html#considerations.for.new.authentication.schemes>

That is incorrect. We've had these requirements all along, based on 
the ABNF.

Looking at RFC 2616:

    Cache-Control   = "Cache-Control" ":" 1#cache-directive

    cache-directive = cache-request-directive
         | cache-response-directive

    cache-request-directive =
           "no-cache"                          ; Section 14.9.1
         | "no-store"                          ; Section 14.9.2
         | "max-age" "=" delta-seconds         ; Section 14.9.3, 14.9.4
         | "max-stale" [ "=" delta-seconds ]   ; Section 14.9.3
         | "min-fresh" "=" delta-seconds       ; Section 14.9.3
         | "no-transform"                      ; Section 14.9.5
         | "only-if-cached"                    ; Section 14.9.4
         | cache-extension                     ; Section 14.9.6

    cache-response-directive =
           "public"                               ; Section 14.9.1
         | "private" [ "=" <"> 1#field-name <"> ] ; Section 14.9.1
         | "no-cache" [ "=" <"> 1#field-name <"> ]; Section 14.9.1
         | "no-store"                             ; Section 14.9.2
         | "no-transform"                         ; Section 14.9.5
         | "must-revalidate"                      ; Section 14.9.4
         | "proxy-revalidate"                     ; Section 14.9.4
         | "max-age" "=" delta-seconds            ; Section 14.9.3
         | "s-maxage" "=" delta-seconds           ; Section 14.9.3
         | cache-extension                        ; Section 14.9.6

    cache-extension = token [ "=" ( token | quoted-string ) ]

So for any directive not defined by RFC 2616, recipients always had to 
accept all three forms: no argument, token, or quoted-string. This is 
not new.
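
For reference, a generic recipient following the cache-extension 
production can be sketched in a few lines of Python. This is only an 
illustration under simplifying assumptions: the name parse_cache_control 
is made up, obs-text inside quoted-strings is not treated specially, and a 
repeated directive simply overwrites the earlier one.

```python
import re

# token and quoted-string, following the RFC 2616 productions
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
QUOTED = r'"(?:[^"\\]|\\.)*"'

# One list element: token [ "=" ( token | quoted-string ) ], followed by
# a comma or the end of the field value. Leading commas and whitespace
# are skipped so that empty list elements are tolerated.
DIRECTIVE = re.compile(
    r"[ \t,]*(" + TOKEN + r")(?:=(" + TOKEN + "|" + QUOTED + r"))?[ \t]*(?:,|$)"
)


def parse_cache_control(value):
    """Return {lowercased-name: argument-or-None} for a field value."""
    value = value.rstrip(" \t,")
    directives = {}
    pos = 0
    while pos < len(value):
        m = DIRECTIVE.match(value, pos)
        if m is None:
            raise ValueError("malformed directive at offset %d" % pos)
        # Directive names are compared case-insensitively.
        name, arg = m.group(1).lower(), m.group(2)
        if arg is not None and arg.startswith('"'):
            # Strip the quotes and undo quoted-pair escaping.
            arg = re.sub(r"\\(.)", r"\1", arg[1:-1])
        directives[name] = arg
        pos = m.end()
    return directives


# Both argument forms yield the same result for the recipient.
print(parse_cache_control('max-age=60, no-store'))
print(parse_cache_control('max-age="60", no-store'))
```

Nothing in this parser depends on knowing a particular directive, which 
is the point: unknown directives are skipped over correctly whichever of 
the three forms they use.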

> We don't place any conformance requirements on media type parameters either: <https://svn.tools.ietf.org/svn/wg/httpbis/draft-ietf-httpbis/latest/p2-semantics.html#media.types>; we only note that they can be transmitted in either form. Why not use similar language here?

It says:

> The type/subtype MAY be followed by parameters in the form of attribute/value pairs.
>
>   parameter      = attribute "=" value
>   attribute      = token
>   value          = word
>
> The type, subtype, and parameter attribute names are case-insensitive. Parameter values might or might not be case-sensitive, depending on the semantics of the parameter name. The presence or absence of a parameter might be significant to the processing of a media-type, depending on its definition within the media type registry.
>
> A parameter value that matches the token production can be transmitted as either a token or within a quoted-string. The quoted and unquoted values are equivalent.

So I don't see any language here that would allow a parameter to 
restrict the serialization within the header field.

> What I'd really like, though, is to hear what someone else thinks, so this isn't just a back-and-forth between Julian and me. Anyone?

Yes, more feedback would be appreciated. I think it's important to be 
as consistent as possible (without making existing code non-compliant); 
otherwise it's hard to see how we can argue for consistency and 
simplicity in new header fields (STS comes to mind).

Best regards, Julian

Received on Thursday, 14 June 2012 07:30:20 UTC