Re: TICKET 259: 'treat as invalid' not defined

On 9/11/2010 3:13 p.m., Mark Nottingham wrote:
> My inclination here is to (as with the others) punt the issue of actually defining 'invalid' in the 'core' specification; it can be either an appendix or a separate document.
> However, I'm wondering if we can at least identify those places where errors may occur consistently; e.g., by saying things like:
> "Header values with multiple instances are considered invalid."
> "Header values that do not match the ABNF are considered invalid."
> "..."
> and then something like
> "A header value that is considered invalid MAY be ignored, or MAY have error handling algorithms applied to attempt to recover the intent of the message."
> Thoughts?

On the topic of suggesting that invalid headers be ignored:

Typically ignoring invalid headers is actually a recovery strategy, 
since the goal is to attempt to proceed to process the message.  Not 
ignoring the error would be the case where the message is rejected somehow.

Some invalid headers are not safe to ignore.  E.g. Content-Length, 
Transfer-Encoding, Content-Encoding and maybe some others.

So, maybe we need 2 classes of header.  One where we feel it's safe to 
ignore errors (act as if the header were absent), and others where we 
feel it's not safe.  I don't think one blanket rule for all really works.
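
As a rough illustration of the two-class idea, here is a minimal Python
sketch. The set contents come only from the examples named above; the
function name and the action strings are hypothetical, not anything the
WG has agreed on.

```python
# Hypothetical sketch: headers where an invalid value should not be
# silently ignored. The members are just the examples from this mail,
# not an agreed or exhaustive list.
UNSAFE_TO_IGNORE = {"content-length", "transfer-encoding", "content-encoding"}


def on_invalid_header(name: str) -> str:
    """Pick an action for a header whose value failed validation."""
    if name.strip().lower() in UNSAFE_TO_IGNORE:
        # Framing/encoding headers: guessing risks misreading the body.
        return "reject-message"
    # Everything else: act as if the header were absent.
    return "treat-as-absent"
```

Header names are compared case-insensitively, since HTTP field names are
case-insensitive.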

Maybe even differentiate between errors in say parameters (e.g. q 
values) vs the rest of the header.

Or did you use "ignored" in the sense of "treat as if the header were
absent"?

In that case, for Content-Length there are defined processes for 
dealing with its absence.  My gut feel is that this particular one 
could lead to smuggling issues, so we would be back to maybe 2 
classes of header (ignorable and not).

For intermediaries it might help to have some language about headers 
believed to be in error (in respect of forwarding).

E.g. should a proxy:

* strip
* forward as is (leave it up to the next agent)
* attempt to correct (I wouldn't recommend this)
* reject
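
The four options above can be sketched as a small policy switch. This is
only an illustration in Python: the enum, the function name, and the
list-of-tuples header representation are all assumptions made for the
example, and "attempt to correct" is deliberately left unimplemented.

```python
from enum import Enum


class ProxyAction(Enum):
    STRIP = "strip"
    FORWARD = "forward"
    CORRECT = "correct"
    REJECT = "reject"


def apply_action(headers, bad_name, action):
    """Apply a policy to a header believed to be in error.

    headers is a list of (name, value) tuples. Returns the header list
    to forward, or None if the message is rejected.
    """
    if action is ProxyAction.REJECT:
        return None
    if action is ProxyAction.STRIP:
        # Drop every instance of the offending header.
        return [(n, v) for (n, v) in headers if n.lower() != bad_name.lower()]
    if action is ProxyAction.FORWARD:
        # Leave it up to the next agent.
        return list(headers)
    # CORRECT would need header-specific repair logic; not sketched here.
    raise NotImplementedError("correction is header-specific")
```

Which action is appropriate would presumably depend on the header class
discussed earlier, e.g. stripping an invalid Content-Length changes how
the next agent frames the message.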



> On 02/11/2010, at 1:56 PM, Adam Barth wrote:
>> On Mon, Nov 1, 2010 at 5:43 PM, Mark Nottingham wrote:
>>>> =>  a header field value with multiple instances of the same parameter
>>>> SHOULD be treated as invalid.
>>>> Similarly, this requirement probably should read "user agents SHOULD
>>>> treat a header field value with multiple instances of the same
>>>> parameter as invalid."  Furthermore, the document should define what
>>>> treating a header field value as invalid means.  Presumably the author
>>>> intends that user agents ought to ignore such header field values.
>>>> I'm skeptical that is the optimum behavior for user agents.  I would
>>>> have expected user agents to either use the first or the last instance
>>>> of each parameter.
>>> Ticket:
>>> Note that it may be resolved by indicating that 'treat as invalid' is specific to the application at hand. As such, I'd like initial discussion of this in the WG to focus on:
>>>   a) use cases: how different implementations / applications may want to have different notions of 'invalid' (or not), and
>> The browser use case proceeds from the following premises.
>> 1) Many servers send invalid messages to user agents.
>> 2) Many existing user agents have some behavior for these invalid messages.
>> 3) If a new user agent wishes to compete in the market, that user
>> agent needs to handle the invalid messages in the same way as the
>> existing user agents.
>> Use case: Users benefit when there is competition among browser
>> vendors.  Without specifying how to handle invalid messages, new user
>> agents need to reverse engineer the behavior of existing user agents,
>> making it more difficult to compete in the marketplace.
>>>   b) security: what the security impact of having different notions of 'invalid' here may be, and
>>>   c) interoperability: likewise, the interop impact.
>> The largest interoperability problem here is that different user
>> agents handle invalid messages differently.  For example, today, it's
>> unpredictable how user agents will interpret a filename parameter with
>> a % character.  That part of the protocol is less useful to servers
>> because the results are not predictable.
>> Adam
> --
> Mark Nottingham

Received on Tuesday, 9 November 2010 04:03:52 UTC