New Proposed Chunked Rules

  In addtion to the change to chunk length syntax, we discussed in
  Memphis clarifying the rules on what 'headers' are valid in the
  footer of a chunk encoded response.  Merging these two (because the
  affect the same paragraphs in the spec), I propose that the
  following text replace a portion of section 3.6 beginning with the
  chunked encoding syntax up to but not including the paragraph which
  begins "An example process for":

  ================================================================

       Chunked-Body   = *chunk
                        last-chunk
                        footer
                        CRLF

       chunk          = chunk-size [ chunk-ext ] CRLF
                        chunk-data CRLF

       chunk-size     = 1*HEX

       last-chunk     = 1*("0") [ chunk-ext ] CRLF

       chunk-ext      = *( ";" chunk-ext-name [ "=" chunk-ext-value ] )
       chunk-ext-name = token
       chunk-ext-val  = token | quoted-string
       chunk-data     = chunk-size(OCTET)

       footer         = *entity-header

  The chunked encoding is ended by any chunk whose size is zero,
  followed by the footer, which is terminated by an empty line.  The
  purpose of the footer is to provide an efficient way to supply
  information about an entity that is generated dynamically.
  Applications MUST NOT send header fields in the footer which are not
  explicitly defined as being appropriate for the footer.

  The Content-MD5 header (section 14.16) is appropriate for the footer.

  The Authentication-Info header defined by RFC 2069 (An Extension to
  HTTP: Digest Access Authentication), or its successor is appropriate
  for the footer.

  ================================================================

  The modified syntax above also allows a chunk-ext on the zero length
  last-chunk, which I added for symmetry, and for the length of the
  final chunk to be multiple digits as long as they are all zero.

  There has been some discussion of whether or not LWS is allowed
  in the chunk length line.  I believe that the following, taken from
  section 2.1 of the current spec, specifically allows LWS:

    implied *LWS
         The grammar described by this specification is word-based. Except
         where noted otherwise, linear whitespace (LWS) can be included
         between any two adjacent words (token or quoted-string), and
         between adjacent tokens and delimiters (tspecials), without
         changing the interpretation of a field. At least one delimiter
         (tspecials) must exist between any two tokens, since they would
         otherwise be interpreted as a single token.

  My reading is that all of the following would be valid:

11CRLF
12 CRLF
13;foo=barCRLF
14;foo=bar CRLF
15; foo=barCRLF
16; foo=bar CRLF
17 ; foo=barCRLF
18 ; foo=bar CRLF

  I do not believe that the rule covers (or needs to cover) LWS
  preceeding the length (that is not a boundary between a word and
  either another word or a tspecial), especially now that we allow
  leading zeros.

--
Scott Lawrence                                       <lawrence@agranat.com>
Agranat Systems, Inc.                               http://www.agranat.com/

Received on Monday, 14 April 1997 09:24:36 UTC