Re: Comments on draft-ietf-httpbis-content-disp from Mark Nottingham on 2010-11-02 (ietf-http-wg@w3.org from October to December 2010)

From: Mark Nottingham <mnot@mnot.net>
Date: Tue, 2 Nov 2010 11:43:15 +1100
To: Adam Barth <ietf@adambarth.com>
Cc: httpbis <ietf-http-wg@w3.org>
Message-Id: <4E615ED5-7E9B-4A68-8935-32A867933291@mnot.net>
Adam,

Thanks.

At a high level, I'd like to use this discussion to resolve issue #186:
  http://trac.tools.ietf.org/wg/httpbis/trac/ticket/186
in that once we figure out the depth of error-handling that's appropriate in this spec, we should be able to apply that to the HTTP spec overall. 

Some specific thoughts below. Note that I've opened a number of tickets here; please focus follow-up discussion on specific tickets (e.g., calling out the ticket number in the Subject line).


On 01/11/2010, at 7:30 PM, Adam Barth wrote:

> == Disclaimers ==
> 
> 1) I'm aware that there are more implementors of user agents than
> browsers.  I'm not interested in being reminded of that fact.  Browser
> user agents, however, are one important group of user agents.
> 
> 2) I'm aware that this working group does not share my perspective on
> what constitutes a useful specification for user agent implementors.
> I'm not interested in discussing whether the level of precision I'm
> requesting is valuable.
> 
> 3) I'm aware that this document reflects business-as-usual in the
> IETF.  My position is that business-as-usual does not meet the needs
> of browser user agent implementors, largely because browser user agent
> implementors have been effectively absent from the IETF process for the
> better part of a decade.

As Bjoern has noted, this isn't the best way to start a discussion, but here we are. We will most certainly discuss the implications of any changes on non-browser UAs, and will also discuss the appropriate level of precision to specify. You can choose to participate in those discussions or not, of course.

I'll leave your statements about the IETF as a whole to stand on their own merits.


> == Comments ==
> 
> The comments below are relative to
> http://tools.ietf.org/html/draft-ietf-httpbis-content-disp-03, which
> is the most recent version I could find on the IETF web site.

Yes; that's the version that is under WGLC.


> http://tools.ietf.org/html/draft-ietf-httpbis-content-disp-03#section-3.1
> 
> This section defines a grammar for the Content-Disposition header
> field.  However, the document does not define how a user agent should
> interpret Content-Disposition header fields that do not conform to
> this grammar.  To foster interoperability between user agent
> implementations, the document should define how user agents are to
> process every sequence of bytes they could receive in a
> Content-Disposition header field.
> 
> => Parameter names MUST NOT be repeated.
> 
> The document should not phrase normative requirements in the passive
> voice.  Instead, the document should make clear which protocol
> participants are bound by each requirement.  For example, this
> requirement probably should read "servers MUST NOT generate
> Content-Disposition header field values with multiple instances of the
> same parameter name."

I think this is largely editorial feedback. It probably isn't appropriate to say 'servers', since other parties can also send the header, but something like the following would work:

Senders MUST NOT generate C-D header field values with multiple instances of the same parameter name.

Ticket: 
  http://trac.tools.ietf.org/wg/httpbis/trac/ticket/258


> => a header field value with multiple instances of the same parameter
> SHOULD be treated as invalid.
> 
> Similarly, this requirement probably should read "user agents SHOULD
> treat a header field value with multiple instances of the same
> parameter as invalid."  Furthermore, the document should define what
> treating a header field value as invalid means.  Presumably the author
> intends that user agents ought to ignore such header field values.
> I'm skeptical that is the optimum behavior for user agents.  I would
> have expected user agents to either use the first or the last instance
> of each parameter.

Ticket:
  http://trac.tools.ietf.org/wg/httpbis/trac/ticket/259

Note that this ticket may be resolved by stating that the meaning of 'treat as invalid' is specific to the application at hand. As such, I'd like initial discussion of this in the WG to focus on:
 a) use cases: how different implementations / applications may want to have different notions of 'invalid' (or not), and
 b) security: what the security impact of having different notions of 'invalid' here may be, and
 c) interoperability: likewise, the interop impact.
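
To make the options in #259 concrete, here is a minimal Python sketch of the three behaviours mentioned so far for a repeated parameter: treat the whole value as invalid, use the first instance, or use the last. The names parse_params and resolve are hypothetical, the parser is deliberately simplified (token or quoted-string values only, no filename* decoding), and none of this is taken from the draft.

```python
import re

# Simplified, illustrative matcher for ";name=value" pairs. The value is
# either a quoted-string (with backslash escapes) or a run of characters
# up to the next ";" or whitespace. This is an assumption for the sketch,
# not the grammar from the draft.
_PARAM = re.compile(r';\s*([^\s=;]+)\s*=\s*("(?:[^"\\]|\\.)*"|[^;\s]+)')


def parse_params(value):
    """Return (disposition_type, [(name, value), ...]), keeping duplicates."""
    disp_type, _, rest = value.partition(";")
    pairs = []
    for name, raw in _PARAM.findall(";" + rest):
        if raw.startswith('"'):
            # Strip the surrounding quotes and undo backslash escapes.
            raw = re.sub(r"\\(.)", r"\1", raw[1:-1])
        pairs.append((name.lower(), raw))
    return disp_type.strip().lower(), pairs


def resolve(pairs, policy):
    """Apply one duplicate-handling policy: 'invalid', 'first', or 'last'."""
    names = [name for name, _ in pairs]
    if len(set(names)) == len(names):
        return dict(pairs)          # no duplicates, every policy agrees
    if policy == "invalid":
        return None                 # caller ignores the whole header value
    if policy == "first":
        result = {}
        for name, val in pairs:
            result.setdefault(name, val)
        return result
    if policy == "last":
        return dict(pairs)          # later pairs overwrite earlier ones
    raise ValueError("unknown policy: " + policy)


if __name__ == "__main__":
    _, params = parse_params("attachment; filename=foo.html; filename=bar.html")
    for policy in ("invalid", "first", "last"):
        print(policy, resolve(params, policy))
```

Two recipients that pick different policies will save the same response under different names (or not as an attachment at all), which is the interop and security question in b) and c).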


> http://tools.ietf.org/html/draft-ietf-httpbis-content-disp-03#section-3.2
> 
> This section does not define how user agents ought to process header
> field values with multiple disposition types.  According to this test
> case <http://greenbytes.de/tech/tc2231/#attandinline2>, user agents
> MUST use the first disposition type.

Ticket:
  http://trac.tools.ietf.org/wg/httpbis/trac/ticket/260


> http://tools.ietf.org/html/draft-ietf-httpbis-content-disp-03#section-3.3
> 
> This section provides very little guidance about how to extract a file
> name from the filename parameter.  For example, it fails to instruct
> the user agent about how to handle the following test cases:
> 
> http://greenbytes.de/tech/tc2231/#attwithasciifnescapedquote
> http://greenbytes.de/tech/tc2231/#attwithasciifilenamenqws
> http://greenbytes.de/tech/tc2231/#attwithutf8fnplain
> http://greenbytes.de/tech/tc2231/#attwithfnrawpctenca
> http://greenbytes.de/tech/tc2231/#attwith2filenames
> http://greenbytes.de/tech/tc2231/#attfnbrokentoken
> http://greenbytes.de/tech/tc2231/#attbrokenquotedfn

Ticket:
  http://trac.tools.ietf.org/wg/httpbis/trac/ticket/261


> In particular, this document should define an algorithm that takes as
> input a sequence of bytes obtained by parsing the Content-Disposition
> header field value and returns a sequence of characters which is the
> file name requested by the server.

I'm treating that as editorial advice.


> The document defines filename* by referring to RFC5987, but RFC5987
> does not define a precise algorithm for computing the file name from a
> sequence of input bytes.

If you have specific issues (rather than just a general desire for an algorithm), please raise them (e.g., as in #261, although even more specificity would be appreciated).


> Jungshik Shin writes:
> 
> [[
> As for RFC 5987, I'm aware that it's a profile of RFC 2231 (it's good
> that it's simpler than the full RFC 2231), but I wrote that it's
> unnecessarily 'complex' and not many web servers would adopt that
> anytime soon. That's why I advocated a much simpler approach of using
> (percent-encoded) UTF-8. I'm aware that it has its own share of
> issues, but I suspect that it's got a better chance of being adopted
> by web servers.
> ]]
> 
> I agree with his assessment.  We should simply use percent-encoded
> UTF-8 instead of letting the server specify whatever crazy encoding it
> dreams up.

The 'crazy encoding' you refer to isn't dreamed up by the Web server. Regardless, if you'd like to pursue this path, you need to make a proposal and do the legwork to show that implementers will support it, with a reasonable backwards-compatibility story.
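
For readers comparing the two approaches, the following Python sketch decodes an RFC 5987 style value (charset'language'percent-encoded-bytes) next to the bare percent-encoded UTF-8 form being advocated. The function names are hypothetical, and restricting the charset to UTF-8 and ISO-8859-1 is a simplifying assumption of this sketch; it is an illustration of the difference in shape, not a proposal for an algorithm.

```python
from urllib.parse import unquote


def decode_ext_value(ext_value):
    """Decode charset'language'value-chars; return (text, language)."""
    charset, sep1, rest = ext_value.partition("'")
    language, sep2, encoded = rest.partition("'")
    if not (sep1 and sep2):
        raise ValueError("expected charset'language'value-chars")
    # Assumption for this sketch: accept only these two charsets.
    if charset.lower() not in ("utf-8", "iso-8859-1"):
        raise ValueError("unsupported charset: " + charset)
    # errors="strict" makes malformed byte sequences raise instead of
    # being silently replaced.
    return unquote(encoded, encoding=charset, errors="strict"), language


def decode_percent_utf8(encoded):
    """Decode the simpler form: percent-encoded UTF-8, no charset or language."""
    return unquote(encoded, encoding="utf-8", errors="strict")


if __name__ == "__main__":
    print(decode_ext_value("UTF-8''%e2%82%ac%20rates"))
    print(decode_ext_value("iso-8859-1'en'%A3%20rates"))
    print(decode_percent_utf8("%e2%82%ac%20rates"))
```

The only structural difference is the charset and language prefix; the backwards-compatibility question is what existing recipients do when they see either form.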


> Also, we should remove the language tagging facility
> because it is gratuitous.

Can you say a bit more here? We can open an issue for this, but your reasoning (beyond "it's gratuitous") isn't clear.


> http://tools.ietf.org/html/draft-ietf-httpbis-content-disp-03#appendix-C.2
> 
> As far as I can tell, this is actually the biggest interoperability
> problem with the Content-Disposition header field.  Unfortunately,
> this document does nothing to resolve this issue.  I recommend that
> this document take a position with respect to how to handle
> percent-encoded values in the filename parameter.  Specifically, I
> recommend that the document instruct user agents to decode percent
> encoded values using the user's preferred encoding.  Yes, that's ugly,
> but it's the way Content-Disposition works in the real world and the
> most likely requirement to actually be implemented by user agents.

Could you expand upon this a bit more? E.g., are you saying that after the 5987 encoding is removed, the resulting string should be percent-decoded? Or that the filename (no *) parameter should be percent-decoded? Both?

Raising as a placeholder:
  http://trac.tools.ietf.org/wg/httpbis/trac/ticket/262

I suspect that this issue is going to be similar to #259; i.e., the answer may be different for different implementers. As such, we should try to figure out if that's the case (and why) first.
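
As a small illustration of why the answer may differ between implementers, this Python sketch percent-decodes a filename value and then interprets the bytes using a caller-supplied "preferred encoding". The function name decode_filename and the idea of passing the encoding as a parameter are assumptions made for the example; it does not settle which parameter (filename, filename*, or both) the decoding would apply to.

```python
from urllib.parse import unquote_to_bytes


def decode_filename(raw, preferred_encoding):
    """Percent-decode raw, then decode the bytes with the given encoding."""
    data = unquote_to_bytes(raw)
    # errors="replace" substitutes U+FFFD for bytes that are not valid in
    # the chosen encoding, so the mismatch is visible in the result.
    return data.decode(preferred_encoding, errors="replace")


if __name__ == "__main__":
    for raw in ("foo-%c3%a4.html", "foo-%E4.html"):
        for enc in ("utf-8", "iso-8859-1"):
            print(raw, enc, repr(decode_filename(raw, enc)))
```

The same header value yields a different file name depending on the recipient's configured encoding, so a sender cannot predict the result unless the encoding is fixed by the spec.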


> In short, this document does not address the needs of browser user
> agent implementors.  This objection can be resolved in two ways:

This isn't the W3C; we don't have objections. Make arguments and they'll be judged on technical merit, based upon the rough consensus of implementers and their running code.

Cheers,


--
Mark Nottingham   http://www.mnot.net/
Received on Tuesday, 2 November 2010 00:43:49 UTC