- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Fri, 23 May 2008 15:19:09 +0200
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- CC: HTTP Working Group <ietf-http-wg@w3.org>
Bjoern Hoehrmann wrote:
> * Julian Reschke wrote:
>> Let's take an example, such as Accept-Charset:
>>
>> Accept-Charset = "Accept-Charset" ":"
>> 1#( ( charset | "*" ) [ ";" "q" "=" qvalue ] )
>>
>> (<http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p3-payload-02.html#rfc.section.6.2>)
>>
>> A mechanical translation would yield:
>>
>> Accept-Charset = "Accept-Charset" ":"
>> ( *LWS ( charset / "*" ) [ ";q=" qvalue ]
>> *( *LWS "," *LWS ( charset / "*" ) [ ";q=" qvalue ] ) )
>>
>> (hopefully).
>
> There are several differences here in what values the two allow; you did
> not call them out so I am not sure whether they are intentional. In par-
> ticular these are valid under the old production but not under yours:
>
> Accept-Charset: utf-8,,*
Good catch.
It seems to me that
"Wherever this construct is used, null elements are allowed, but do not
contribute to the count of elements present. That is, "(element), ,
(element) " is permitted, but counts as only two elements. Therefore,
where at least one element is required, at least one non-null element
MUST be present. Default values are 0 and infinity so that "#element"
allows any number, including zero; "1#element" requires at least one;
and "1#2element" allows one or two."
doesn't translate well into ABNF syntax. So even if we said:
Accept-Charset = "Accept-Charset" ":"
( *LWS ( charset / "*" ) [ ";q=" qvalue ]
*( *LWS "," *LWS [ ( charset / "*" ) [ ";q=" qvalue ] ] ) )
that would allow
Accept-Charset: utf-8,,*
but not
Accept-Charset: ,,utf-8
which is valid in RFC2616. So we need to handle leading ("," *LWS)
separately...
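For illustration, the RFC 2616 list rule quoted above is easier to state procedurally than as ABNF. The following is only a rough sketch, not anything from the drafts: the helper names `split_list` and `satisfies_one_or_more` are made up for this example, and the naive split on "," is only safe for fields such as Accept-Charset whose elements cannot contain quoted strings.

```python
def split_list(field_value):
    """Split a '#'-rule field value on commas, discarding null elements.

    Assumption: elements never contain quoted strings or commas, which
    holds for Accept-Charset but not for every list-valued header.
    """
    elements = []
    for part in field_value.split(","):
        item = part.strip(" \t\r\n")  # drop surrounding linear white space
        if item:                      # null elements do not count
            elements.append(item)
    return elements


def satisfies_one_or_more(field_value):
    """True if the value meets '1#element': at least one non-null element."""
    return len(split_list(field_value)) >= 1
```

With this reading, "utf-8,,*" and ",,utf-8" are both accepted, and a value consisting only of commas and white space is rejected.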
> Accept-Charset: utf-8 ; q = ...
That's a mistake I made when pasting bap's output back into the mail (bap
doesn't know about implied LWS). So it should have read:
Accept-Charset = "Accept-Charset" ":"
( *LWS ( charset / "*" ) [ ";" "q" "=" qvalue ]
*( *LWS "," *LWS ( charset / "*" ) [ ";" "q" "=" qvalue ] ) )
> I'm not sure whether your new production should be read assuming implied
> linear white space, if not there are a number of additional differences,
> and if so, then the production is more complex than would be necessary.
I was trying to get the list rule issue resolved first; of course the
implied LWS needs to be resolved as well.
> It would certainly be wise to factor repeated productions out into sepa-
> rate productions, yes.
So, combining this, but still ignoring implied LWS, we'd get:
AC-f = ( ( charset / "*" ) [ ";" "q" "=" qvalue ] )
AC-e = *LWS AC-f
Accept-Charset = "Accept-Charset" ":" *( *LWS "," ) AC-e
                 *( *LWS "," [ AC-e ] )
Or...
AC-f = ( ( charset / "*" ) [ ";" "q" "=" qvalue ] )
AC-e = *LWS AC-f
COMMA = *LWS ","
Accept-Charset = "Accept-Charset" ":" *COMMA AC-e *( COMMA [ AC-e ] )
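One way to sanity-check the last production against the examples in this thread is to transcribe it into a regular expression. This is a hedged sketch only: it covers the field value (the part after the colon), still ignores implied LWS as the production does, and fills in `token`, `qvalue` and `LWS` from my reading of RFC 2616. The Python names (`LWS_STAR`, `FIELD_VALUE`, `matches`) are invented for the example.

```python
import re

# *LWS, where LWS = [CRLF] 1*( SP | HT ). Written as a flat repetition so
# that the regex has no nested quantifiers over the same characters.
LWS_STAR = r"(?:(?:\r\n)?[ \t])*"

# token = 1*<any CHAR except CTLs or separators> (RFC 2616, Section 2.2)
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"

# qvalue = ( "0" [ "." 0*3DIGIT ] ) | ( "1" [ "." 0*3("0") ] )
QVALUE = r"(?:0(?:\.[0-9]{0,3})?|1(?:\.0{0,3})?)"

# AC-f = ( ( charset / "*" ) [ ";" "q" "=" qvalue ] ), with charset = token
AC_F = rf"(?:{TOKEN}|\*)(?:;q={QVALUE})?"
# AC-e = *LWS AC-f
AC_E = rf"{LWS_STAR}{AC_F}"
# COMMA = *LWS ","
COMMA = rf"{LWS_STAR},"
# Field value: *COMMA AC-e *( COMMA [ AC-e ] )
FIELD_VALUE = re.compile(
    rf"(?:{COMMA})*{AC_E}(?:{COMMA}(?:{AC_E})?)*",
    re.IGNORECASE,  # literal strings in the grammar are case-insensitive
)


def matches(field_value):
    """True if the whole field value is derivable from the production."""
    return FIELD_VALUE.fullmatch(field_value) is not None
```

Under these assumptions "utf-8,,*" and ",,utf-8" both match, ",," does not (no non-null element), and "utf-8 ; q=0.5" does not either, which is the implied-LWS gap that still has to be resolved.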
The more I look into this, the better the original syntax looks :-)
BR, Julian
Received on Friday, 23 May 2008 13:19:57 UTC