
Re: [#95] Multiple Content-Lengths

From: Eric J. Bowman <eric@bisonsystems.net>
Date: Fri, 15 Oct 2010 01:46:11 -0600
To: Maciej Stachowiak <mjs@apple.com>
Cc: Julian Reschke <julian.reschke@gmx.de>, Adam Barth <w3c@adambarth.com>, "William Chan (ι™ˆζ™Ίζ˜Œ)" <willchan@chromium.org>, "Roy T. Fielding" <fielding@gbiv.com>, Mark Nottingham <mnot@mnot.net>, HTTP Working Group <ietf-http-wg@w3.org>
Message-Id: <20101015014611.6fd69805.eric@bisonsystems.net>
Maciej Stachowiak wrote:
> >> 
> >> Is there a reason for this to be a SHOULD instead of a MUST? I know
> >> Adam already asked that, but I don't recall seeing an answer.
> >> 
> > 
> > While MUST would be correct for a browser-conformance
> > specification, it isn't in scope for HTTP to dictate what
> > user-agents do with responses, only how to interpret them.
> Requiring clients to close the connection seems in scope to me.
> Though to be fair, the requirement for the very explicit "Connection:
> close" directive is only SHOULD-level.

Exactly my point.  HTTPbis won't be finished for decades if every last
aspect of it must be rewritten to detail how all user agents should
behave, based on the requirements of one specific class of user agent.

I don't have a problem with browser vendors collaborating on a
specification for how browsers, specifically, should behave in specific
situations.  But such a spec should be supplemental to HTTP, not part
of it.

> > If I'm debugging a server using curl, I
> > want to see duplicate or incorrect Content-Length headers instead of
> > total failure.
> Debugging is a special case. I would not be inclined to wrap the
> whole spec around it.

How does the explanation of reading to the end of the content to
determine length, while also stating that's an error, "wrap the whole
spec around" debugging?  I gave one concrete example of a use case
which precludes a MUST... how many more do I need to provide in order
to make my point that HTTP shouldn't be wrapped around the specific
needs of *any* class of user agent?
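To make the disagreement concrete, here is a rough sketch (entirely my own;
not from curl, any browser, or the draft text) of the two behaviors being
argued over when duplicate Content-Length headers disagree:

```python
# Hypothetical sketch: all names here are invented for illustration.
# A "strict" client treats conflicting Content-Lengths as fatal (the
# browser-style behavior); a non-strict client reports the error but
# falls back to reading until the connection closes (the debugging case).

def content_lengths(headers):
    """Collect every Content-Length value, including comma-separated lists."""
    values = []
    for name, value in headers:
        if name.lower() == "content-length":
            values.extend(v.strip() for v in value.split(","))
    return values

def body_length(headers, strict=False):
    """Return the declared length, or None to read until the connection closes."""
    values = content_lengths(headers)
    if not values:
        return None
    if len(set(values)) > 1:
        if strict:  # browser-style: discard the response entirely
            raise ValueError("conflicting Content-Length values: %r" % values)
        print("warning: conflicting Content-Length values: %r" % values)
        return None  # debugging-style: show what the server actually sent
    return int(values[0])

headers = [("Content-Type", "text/plain"),
           ("Content-Length", "120"),
           ("Content-Length", "99")]
print(body_length(headers))  # non-strict: warns, then falls back to None
```

Note that `Content-Length: 120, 120` (one field, comma-separated) and two
separate `Content-Length` fields both end up as multiple values here; the
strict/non-strict fork is exactly the MUST-versus-SHOULD question at issue.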

> That being said, it seems like a debugging tool would be entitled to
> report the error, including multiple Content-Lengths, but it would
> also be required to close the connection immediately and to not
> present the message as if it were a normal valid response.

Well, that's a SHOULD.  Have you ever seen a descriptive error message
from curl in such a case?  It'd be kinda pointless for a tool meant to
show exactly what the server output is, because the error in one header
could be the result of a missing CR/LF in the previous header, etc...
I'd rather see the results of the code I'm working on than have to
decipher exactly what caused the user agent error I'm seeing.

But any debate about the debugging example is pointless.  The cardinal
reason HTTP doesn't dictate user-agent behavior is that there are more
use cases than could possibly be accounted for.  The protocol cannot
base itself on what's best for a single class of user agent.  If
that isn't good enough for any class of user agent, the developers of
user agents of that class are free to create a supplementary spec where
their precise needs *are* in-scope.

> > This isn't a security problem in my development environment, access
> > to which isn't possible over the Internet, so I'd rather be able to
> > debug without resorting to wireshark because all user-agents are
> > required to be as paranoid as a desktop browser.  My preferred
> > wording:
> > 
> > "If this is a response message received by a user-agent, the
> > message- body length MAY be determined by reading the connection
> > until it is closed; an error SHOULD be indicated to the user."
> Phrasing it like that sounds like a recipe for interoperability
> failure. Furthermore, it doesn't even list the option that we seem to
> have rough consensus is the best behavior for browsers.

If 99.999% of implementations get it correct, then this entire phrasing
is irrelevant to interoperability.  In general, my gripe is that
HTTPbis will never be finished if every speck of wording is going to go
through the consensus grinder over .001% edge cases arising from broken
implementations which can't be *expected* to interoperate, let alone in
a secure fashion.
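The MAY/SHOULD wording quoted earlier (length MAY be determined by reading
until the connection is closed; an error SHOULD be indicated to the user)
is simple enough to sketch.  The following is purely illustrative, with
names of my own invention:

```python
import io

def read_until_close(stream, indicate_error=print):
    # Per the quoted wording: the message-body length MAY be determined
    # by reading the connection until it is closed, and an error SHOULD
    # be indicated to the user.  How it's indicated is left to the tool.
    indicate_error("warning: no reliable Content-Length; "
                   "reading until the connection closes")
    return stream.read()  # EOF here stands in for the peer closing

# A file-like object stands in for the network connection.
body = read_until_close(io.BytesIO(b"<html>...</html>"))
```

The point of the SHOULD is visible in the signature: a debugging tool can
route `indicate_error` to stderr and still hand back the raw bytes, while a
browser is free to refuse; neither choice is forced by the protocol text.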

> > I understand the security problem, but something which in reality
> > occurs in an undetermined subset of responses (different Content-
> > Lengths) which altogether only account for .001% of traffic
> > (duplicate Content-Lengths) is not justification for SHOULD discard
> > -- which would downgrade debugging tools like curl to being only
> > conditionally conformant, an outcome which just isn't justified.
> The security risk is not proportional to the frequency of the problem
> on the live internet. However, the need for debugging is proportional
> to the frequency. So if the problem is very rare, the tradeoffs
> should be in favor of security. (That's assuming the security risk is
> indeed serious; I do not know enough about the relevant attacks to
> judge).

Whatever security risk there is comes from not following the standard.
That's to be expected in the general case.  It really isn't necessary
for the standard to go into detail about how to handle every specific
case of nonconformant syntax found in the wild.  I don't need to be
told that nonconformant syntax is an undefined security issue; that
goes without saying, and it may not even be relevant to what I'm doing.
So please don't impose paranoid requirements on all user agents based
on FUD and extreme edge cases.  Put those in your supplemental
browser-conformance spec; they don't need to be in HTTP, because
desktop/mobile browsing isn't the only use case of HTTP.

Received on Friday, 15 October 2010 07:46:44 UTC
