Re: is Accept header BNF ambiguous ?

Anselm Baird-Smith:
>
>I have started writing the parsers for the 1.1 headers. Starting with
>the beginning, I am trying to parse the Accept header, whose
>definition is given by (section 14.1):
>
>       Accept         = "Accept" ":" #(
>                             media-range
>                             [ ( ":" | ";" )
>                               range-parameter
>                               *( ";" range-parameter ) ]
>                            | extension-token )
>

[...]

>If the media-range is separated from the range-parameter by a ':',
>then I am happy, everything is fine (note that for a 1.1 server this is
>only a SHOULD, not a MUST, I can't see why).

Some of us just discussed this SHOULD in a phone conference, and we
can't see why it is not a MUST either.  The 05 draft will have a MUST
in 14.1:

#In Accept headers sent by HTTP/1.1 clients, the character separating
#media-ranges from range-parameters MUST be a ":".  HTTP/1.1 servers
                                    ^^^^
#SHOULD be tolerant of use of the ";" separator by HTTP/1.0 clients.

> However, if media-range
>is separated from range-parameter by a ';' (as some clients did in
>HTTP/1.0), then I have no way to know whether the given parameter is
>to be attached to the media-range clause rather than to the
>range-parameter one.

Yes, this is the problem with 1.0 clients.  You can't unambiguously
parse

  image/x-blah; p=4; q=0.4

so HTTP/1.1 does not require it.  It only requires that you `are
tolerant' of its use, which is sufficiently vague to allow just about
anything except generating an error message the 1.0 client would not
have gotten if it had sent

  image/x-blah; p=4: q=0.4

[...]

>More pragmatically when I am parsing:
>
>Accept: text/html;x=1;y=2
>
>I have to be able to select one of the following interpretations:
>
>a) media-range=text/html;x=1;y=2
>   range-parameter=EMPTY
>b) media-range=text/html;x=1
>   range-parameter=y=2
>c) media-range=text/html
>   range-parameter=x=1;y=2
>
>And I don't know how to do it.

Just parse in the most convenient way which yields tolerant behavior.

The best heuristic would be to look for a `q=' and cut just before it,
but in general I would not bother implementing it.
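
For anyone who does want to try it, here is a minimal sketch of that
heuristic in Python.  It is an illustration under assumptions, not text
from the draft: the function name split_accept_item is hypothetical, the
input is a single comma-separated list item with the "Accept:" header name
already stripped, and quoted parameter values containing ';' or ':' are
not handled.

```python
def split_accept_item(item):
    """Split one Accept list item into (media_range, range_parameters).

    Hypothetical helper: a sketch of the "cut just before q=" heuristic.
    """
    # HTTP/1.1 draft form: an explicit ':' separates the media-range
    # (including its own parameters) from the range-parameters.
    if ":" in item:
        media_range, _, params = item.partition(":")
        range_params = [p.strip() for p in params.split(";") if p.strip()]
        return media_range.strip(), range_params

    # HTTP/1.0 form: only ';' separators, so guess the cut point.
    pieces = [p.strip() for p in item.split(";") if p.strip()]
    for i, piece in enumerate(pieces):
        name = piece.partition("=")[0].strip().lower()
        # Cut just before the first parameter named "q"; index 0 is the
        # type/subtype itself and is never treated as a parameter.
        if i > 0 and name == "q":
            return "; ".join(pieces[:i]), pieces[i:]

    # No "q" found: attach everything to the media-range,
    # which is interpretation (a) in the list above.
    return "; ".join(pieces), []
```

With this sketch both `image/x-blah; p=4; q=0.4` and
`image/x-blah; p=4: q=0.4` split into the media-range `image/x-blah; p=4`
and the single range-parameter `q=0.4`, while `text/html;x=1;y=2` keeps
both parameters on the media-range.  The guess is wrong for a 1.0 client
that sends a range-parameter other than q first, which is one reason the
heuristic may not be worth the trouble.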

>Anselm.

Koen.

Received on Thursday, 6 June 1996 16:21:18 UTC