- From: Adam Barth <ietf@adambarth.com>
- Date: Wed, 1 Dec 2010 11:50:23 -0800
- To: Mark Nottingham <mnot@mnot.net>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>
On Wed, Dec 1, 2010 at 3:12 AM, Mark Nottingham <mnot@mnot.net> wrote:
> Adam, do you have a proposal?

Yeah.  Please find my proposal below.  It's certainly not beautiful,
and it likely needs more polish, but it should be a starting point.  I
tried to be as "grammatical" as I could, but couldn't quite figure out
how to avoid all the algorithmic aspects.

The proposal is based on what Chrome does, but cleaned up slightly.
There's some sadness I couldn't quite figure out how to avoid, but I'm
certainly open to talking about it more.  The rules for determining
the disposition-type are particularly goofy.  I wanted to do more
homework to figure out whether we can make those more aesthetic, but I
ran out of time.

One of the ground rules was that my proposal should only differ from
the current draft in error-handling cases.  I believe that's the case,
but I'm not 100% sure.  Please let me know if I've screwed that up.

Adam


== Extracting Parameter Values From Header Fields ==

To extract the value for a given parameter-name from an
unparsed-string, parse the unparsed-string using the following
grammar:

  unparsed-string = *CHAR name *LWS "=" value [ ";" *CHAR ]
  value           = <CHAR, except ";">

where the name production is a grammatical production that is a
case-insensitive match for the given parameter-name.  If the
unparsed-string can be parsed by the grammar in multiple ways, choose
the one in which name appears as close to the beginning of the string
as possible.

If the unparsed-string cannot be parsed by the grammar above, return
the empty string.
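
For illustration only, the extraction rule above could be sketched in
Python roughly as follows.  This is not part of the proposal and not
what Chrome actually runs.  The function name extract_parameter is
made up for the example, LWS is approximated as spaces and tabs
(ignoring line folding), and value is read as a run of zero or more
characters other than ";".  The grammar says nothing about trimming
whitespace or stripping quotes, so the sketch returns the raw text.

```python
import re


def extract_parameter(unparsed: str, parameter_name: str) -> str:
    """Return the raw value for parameter_name, or "" if the parse fails.

    Assumptions (not from the proposal): LWS is treated as spaces/tabs,
    and `value` is a run of zero or more non-";" characters.
    """
    pattern = re.compile(
        re.escape(parameter_name)  # name, matched literally ("filename*" is safe)
        + r"[ \t]*="               # *LWS "="
        + r"([^;]*)",              # value: everything up to the next ";"
        re.IGNORECASE,             # name is a case-insensitive match
    )
    # re.search returns the leftmost match, i.e. the parse in which name
    # appears as close to the beginning of the string as possible.
    match = pattern.search(unparsed)
    return match.group(1) if match else ""
```

With this reading, extract_parameter('attachment; filename=foo.txt',
'filename') yields 'foo.txt', and a header with no "filename="
anywhere yields the empty string.  Because the grammar starts with
*CHAR, a search for "name" also matches the tail of "filename".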

== Decoding the File Name ==

To filename-decode an encoded-string, parse the encoded-string using
the following grammar:

  encoded-string = word *( 1*delimiter word )
  delimiter      = LWS
  word           = <CHAR, except delimiter>

Consider each grammatical element (either a delimiter or a word) in
the order they appear in the encoded-string:

1) If the grammatical element is a delimiter, process the element as
   follows:

   a) If the previous grammatical element was an RFC2047-value,
      ignore this grammatical element.

   b) Otherwise, emit a SP character.

2) If the grammatical element is a word, process the element as
   follows:

   a) If the word contains non-ASCII characters, process the element
      as follows:

      i) If the word is a well-formed UTF-8 string, emit the word
         (decoded as UTF-8) and proceed to the next grammatical
         element.

      ii) Otherwise, *sadness*.  Apparently what we're supposed to do
          here is to use the "referrer" charset, if we have one.
          Otherwise, we fall back to the OS codepage.

   b) If the word is an RFC2047-value, emit the RFC2047 decoding of
      the word and proceed to the next grammatical element.

   c) Let the url-unescaped-word be the word %-unescaped.

   d) Emit the url-unescaped-word (decoded as UTF-8) and proceed to
      the next grammatical element.  (There's actually more sadness
      here if the url-unescaped-word isn't valid UTF-8.)

The emitted characters are the decoded file name.


== Determining the File Name ==

To determine the file name indicated by a Content-Disposition header
field, use the following algorithm:

1) Let filename-star be the value extracted from the
   Content-Disposition header field for the "filename*" parameter.

2) If filename-star parses as an RFC5987-value, return the
   RFC5987-value of filename-star and abort these steps.

3) Let filename be the value extracted from the Content-Disposition
   header field for the "filename" parameter.

4) If filename is empty, instead let filename be the value extracted
   from the Content-Disposition header field for the "name" parameter.
5) If filename is empty, return the empty string and abort these
   steps.

6) Return the filename-decoding of filename.


== Determining the Disposition ==

To determine the disposition-type, parse the Content-Disposition
header field using the following grammar:

  unparsed-string = *LWS nominal-type *CHAR
  nominal-type    = "inline" / "filename" / "name" / ";"

If the Content-Disposition header field fails to parse, then the
disposition-type is "attachment".  Otherwise, the disposition-type is
"inline".


== Processing the Content-Disposition Header Field ==

To process the Content-Disposition header field, use the following
algorithm:

1) Determine the disposition-type.

2) If the disposition-type is "inline", then ...

3) If the disposition-type is "attachment", then let filename be the
   file name indicated by the header field.

...
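
For illustration only, the rule in "Determining the Disposition"
above could be sketched in Python roughly as follows.  This is not
part of the proposal.  The function name disposition_type is made up
for the example, LWS is approximated as spaces and tabs, and the
quoted literals are matched case-insensitively on the assumption that
the usual ABNF convention applies.

```python
import re

# *LWS nominal-type, anchored at the start of the header value.
# The trailing *CHAR in the grammar accepts anything, so it needs no pattern.
_NOMINAL_TYPE = re.compile(r"[ \t]*(?:inline|filename|name|;)", re.IGNORECASE)


def disposition_type(header_value: str) -> str:
    """Return "inline" if the header value parses, else "attachment".

    Assumptions (not from the proposal): LWS is spaces/tabs only and
    the nominal-type literals are case-insensitive.
    """
    # re.match only matches at the beginning of the string, which mirrors
    # the grammar requiring nominal-type right after any leading LWS.
    return "inline" if _NOMINAL_TYPE.match(header_value) else "attachment"
```

Under this reading, a header that starts with a parameter and no type
at all, such as "filename=foo.txt", is treated as "inline", and
anything that does not start with one of the four alternatives,
including the empty string, is treated as "attachment".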
Received on Wednesday, 1 December 2010 19:51:46 UTC