Re: header parsing, trailing OWS from David Morris on 2009-10-07 (ietf-http-wg@w3.org from October to December 2009)

From: David Morris <dwm@xpasc.com>
Date: Wed, 7 Oct 2009 12:35:09 -0700 (PDT)
To: Julian Reschke <julian.reschke@gmx.de>
cc: ietf-http-wg@w3.org
Message-ID: <Pine.LNX.4.64.0910071211030.31234@egate.xpasc.com>

On Wed, 7 Oct 2009, Julian Reschke wrote:
> David Morris wrote:
>> On Thu, 24 Sep 2009, Julian Reschke wrote:
>> 
>>> In the current edits, the last 'MAY' is a 'SHOULD', which makes it read
>>> 
>>> "A field value MAY be preceded by optional whitespace (OWS); a single SP 
>>> is preferred. The field value does not include any leading or trailing 
>>> white space: OWS occurring before the first non-whitespace character of 
>>> the field value or after the last non-whitespace character of the field 
>>> value is ignored and SHOULD be removed without changing the meaning of the 
>>> header field."
>> 
>> 
>> Doesn't read smoothly ... and in fact turns into a directive rather than a 
>> permission.
>> 
>> One alternate to illustrate my point ...
>>    "SHOULD be removed" --> "SHOULD be able to"
>> another ... replace the remainder after "the field value is ignored and"
>> with:
>>     "removing OWS before or after the field value SHOULD NOT change the
>>      meaning of the header field."
>
> I think we should just say:
>
> "OWS occurring before the first non-whitespace character of the field value 
> or after the last non-whitespace character of the field value is ignored and 
> can be removed without changing the meaning of the header field."
>
> ...replacing the MAY in draft 07, and the SHOULD in the current edits, by 
> "can". RFC2119 terminology is not needed here.

It seems to me that by not being more explicit, we harm interoperability.

With your last proposal, it seems obvious to me that removing 
excess whitespace before interpreting a value is the only sensible 
interpretation. But I've seen enough sloppy coding to believe that the 
proposed wording would be read as permission to ignore the existence of 
excess whitespace and use the value, whitespace and all, in some 
comparison. Thus I'm in favor of as strong a wording as we can use at this 
stage of the progression to standard to say that excess whitespace MUST 
be removed before interpreting the value.
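
A minimal Python sketch of the failure mode described above. It assumes 
OWS means only SP and HTAB and ignores line folding; the function name 
field_value and the sample header line are hypothetical, not taken from 
any draft.

```python
# Assumption: OWS is limited to SP and HTAB; folded lines are not handled.
OWS = " \t"


def field_value(raw_line: str) -> str:
    """Return the field value of a 'name: value' line with leading and
    trailing OWS removed. Whitespace inside the value is left alone."""
    name, sep, value = raw_line.partition(":")
    if not sep:
        raise ValueError("not a header field line: missing ':'")
    return value.strip(OWS)


raw = "Connection:   close \t"

# Sloppy reading: "ignore" the whitespace by doing nothing about it.
sloppy = raw.partition(":")[2]
print(sloppy == "close")            # False: the comparison sees the padding

# Reading argued for here: remove OWS, then interpret.
print(field_value(raw) == "close")  # True
```

Both readings are consistent with "is ignored and can be removed"; only 
the second gives the same answer regardless of how the sender padded the 
value.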

Perhaps introduce the notion of a canonical form for headers and values, and 
require conversion to canonical form before processing the values in any 
context where the outcome would differ based on the existence of 
extraneous whitespace. This includes (but is not limited to) computing a 
digest of header values, interpreting values, etc.
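
As a rough illustration of the digest case, here is a Python sketch under 
stated assumptions: the canonical form (lowercased field name, OWS-stripped 
value, one "name:value" line per field) and the names canonical and 
header_digest are invented for this example and are not defined by any 
draft. SHA-256 is used only because it is in the standard library.

```python
import hashlib

# Assumption: OWS is limited to SP and HTAB.
OWS = " \t"


def canonical(name: str, value: str) -> str:
    """Hypothetical canonical form: lowercase name, value without
    leading or trailing OWS, joined by a single colon."""
    return f"{name.lower()}:{value.strip(OWS)}"


def header_digest(headers: list[tuple[str, str]]) -> str:
    """Digest over the canonical form of each header, in the given order."""
    text = "\n".join(canonical(n, v) for n, v in headers)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


sent = [("Content-Type", "text/html"), ("ETag", '"abc"')]
# Same fields as an intermediary might re-emit them, with extra padding.
received = [("content-type", "  text/html \t"), ("ETag", '"abc"  ')]

# The digests agree only because both sides canonicalize first.
assert header_digest(sent) == header_digest(received)
```

Without the canonicalization step, the two digests would differ even 
though the header fields mean the same thing.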

Dave Morris

Received on Wednesday, 7 October 2009 19:35:44 UTC