Re: header parsing, trailing OWS

I think it must always be possible for an agent to remove the OWS 
without breaking the meaning of the header.  We can't really allow a 
situation where the OWS has some meaning, or is relied on, since that 
would break all manner of things (e.g. comparison of selecting 
headers for caches).

If we take that to its logical conclusion, there should be:

a) a MUST level requirement for implementations to not generate 
extraneous OWS
b) a SHOULD level requirement for intermediaries to remove extraneous OWS
c) a requirement for caches and clients to remove OWS when using headers 
(hard to tell whether this should be a SHOULD or a MUST).
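
To illustrate (c), here is a minimal Python sketch of what a cache might 
do before comparing selecting headers. It assumes OWS consists only of SP 
and HTAB and ignores line folding; the names strip_ows and 
same_field_value are made up for illustration and are not taken from any 
draft or implementation.

```python
# Assumption: OWS is any run of SP / HTAB; folded lines are not handled.
OWS_CHARS = " \t"


def strip_ows(field_value: str) -> str:
    """Return the field value without leading or trailing OWS."""
    return field_value.strip(OWS_CHARS)


def same_field_value(a: str, b: str) -> bool:
    """Compare two field values after removing surrounding OWS.

    Hypothetical helper: a cache could use something like this when
    matching selecting headers, so "gzip" and " gzip  " are equal.
    """
    return strip_ows(a) == strip_ows(b)
```

Only leading and trailing OWS is stripped; whitespace inside the value is 
left alone, since it may be significant to the field's own syntax.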

regards

Adrien


David Morris wrote:
>
>
> On Wed, 7 Oct 2009, Julian Reschke wrote:
>> David Morris wrote:
>>> On Thu, 24 Sep 2009, Julian Reschke wrote:
>>>
>>>> In the current edits, the last 'MAY' is a 'SHOULD', which makes it read:
>>>>
>>>> "A field value MAY be preceded by optional whitespace (OWS); a 
>>>> single SP is preferred. The field value does not include any 
>>>> leading or trailing white space: OWS occurring before the first 
>>>> non-whitespace character of the field value or after the last 
>>>> non-whitespace character of the field value is ignored and SHOULD 
>>>> be removed without changing the meaning of the header field."
>>>
>>>
>>> Doesn't read smoothly .... and in fact turns into a directive rather 
>>> than a permission.
>>>
>>> One alternative to illustrate my point ...
>>>    "SHOULD be removed" --> "SHOULD be able to"
>>> another ... replace the remainder after "the field value is ignored
>>> and" with:
>>>     "removing OWS before or after the field value SHOULD NOT change the
>>>      meaning of the header field."
>>
>> I think we should just say:
>>
>> "OWS occurring before the first non-whitespace character of the field 
>> value or after the last non-whitespace character of the field value 
>> is ignored and can be removed without changing the meaning of the 
>> header field."
>>
>> ...replacing the MAY in draft 07, and the SHOULD in the current 
>> edits, by "can". RFC2119 terminology is not needed here.
>
> It seems to me that by not being more explicit, we harm interoperability.
>
> It seems obvious to me that, with your last proposal, removing excess 
> whitespace before interpreting a value is the only sensible 
> interpretation, but I've seen enough sloppy coding to believe that 
> the proposed wording would be interpreted as meaning one can ignore the 
> existence of excess white space and use the value, white space and all, 
> in some comparison. Thus I'm in favor of as strong a wording as we can 
> use at this stage of the progression to standard to say that excess 
> white space MUST be removed for interpreting the value.
>
> Perhaps introduce the notion of canonical form for headers and values 
> and require conversion to canonical form before processing the values 
> in any context where the outcome would be different based on the 
> existence of extraneous white space, including (but not limited to) 
> computation of a digest of header values, interpreting values, etc.
>
> Dave Morris
>

-- 
Adrien de Croy - WinGate Proxy Server - http://www.wingate.com

Received on Wednesday, 7 October 2009 21:37:29 UTC