- From: Julian Reschke <julian.reschke@greenbytes.de>
- Date: Sat, 10 Jan 2015 17:58:46 +0100
- To: ietf@ietf.org
- CC: ietf-http-wg@w3.org
Hi there, these comments are mostly editorial; the main exception being the contents of the predefined static table. Best regards, Julian 1. Introduction In HTTP/1.1 (see [RFC7230]), header fields are not compressed. As Web pages have grown to include dozens to hundreds of requests, the s/include/cause/ (the page doesn't really include requests) 1.1. Overview The format defined in this specification treats a list of header fields as an ordered collection of name-value pairs that can include duplicates. Names and values are considered to be opaque sequences of octets, and the order of header fields is preserved after being compressed and decompressed. When we say duplicate here, it's about duplicate *names*, right? 2.1. Header List Ordering HPACK preserves the ordering of header fields inside the header list. An encoder MUST order header field representations in the header block according to their ordering in the original header list. A decoder MUST order header fields in the decoded header list according to their ordering in the header block. Just before we said that we do not define the encoder. Maybe reprase the first MUST as a statement of fact. 2.3.2. Dynamic Table The dynamic table can contain duplicate entries. Therefore, duplicate entries MUST NOT be treated as an error by a decoder. What's the definition of "duplicate" here? Name and value match? 3.2. Header Field Representation Processing The processing of a header block to obtain a header list is defined in this section. To ensure that the decoding will successfully produce a header list, a decoder MUST obey the following rules. "This section defines..." 4.1. Calculating Table Size The size of the dynamic table is the sum of the size of its entries. The size of an entry is the sum of its name's length in octets (as defined in Section 5.2), its value's length in octets (see Section 5.2), plus 32. Remove one of the references to Section 5.2. The size of an entry is calculated using the length of the name and value without any Huffman encoding applied. This is the first time that Huffman is mentioned; it would probably be good to mention it earlier on. NOTE: The additional 32 octets account for the overhead associated with an entry. For example, an entry structure using two 64-bit pointers to reference the name and the value of the entry, and two 64-bit integers for counting the number of references to the name and value would have 32 octets of overhead. maybe say "estimated overhead"? 5.1. Integer Representation Integers are used to represent name indexes, pair indexes or string lengths. To allow for optimized processing, an integer representation always finishes at the end of an octet. First use of "pair index"; at this point it's not clear what that might be. Pseudo-code to represent an integer I is as follows: if I < 2^N - 1, encode I on N bits else encode (2^N - 1) on N bits I = I - (2^N - 1) while I >= 128 encode (I % 128 + 128) on 8 bits I = I / 128 encode I on 8 bits Pseudo-code to decode an integer I is as follows: decode I from the next N bits if I < 2^N - 1, return I else M = 0 repeat B = next octet I = I + (B & 127) * 2^M M = M + 7 while B & 128 == 128 return I We have bad experience with making indentation in artwork significant; maybe introducing brackets would be a good idea. This integer representation allows for values of indefinite size. It is also possible for an encoder to send a large number of zero values, which can waste octets and could be used to overflow integer values. Excessively large integer encodings - in value or octet length - MUST be treated as a decoding error. Different limits can be set for each of the different uses of integers, based on implementation constraints. Having a MUST here when we don't say what "excessive" is seems like a bad idea. 5.2. String Literal Representation Header field names and header field values can be represented as literal string. A literal string is encoded as a sequence of octets, either by directly encoding the literal string's octets, or by using a Huffman code (see [HUFFMAN]). 0 1 2 3 4 5 6 7 +---+---+---+---+---+---+---+---+ | H | String Length (7+) | +---+---------------------------+ | String Data (Length octets) | +-------------------------------+ Figure 4: String Literal Representation A literal string representation contains the following fields: H: A one bit flag, H, indicating whether or not the octets of the string are Huffman encoded. String Length: The number of octets used to encode the string literal, encoded as an integer with 7-bit prefix (see Section 5.1). String Data: The encoded data of the string literal. If H is '0', then the encoded data is the raw octets of the string literal. If H is '1', then the encoded data is the Huffman encoding of the string literal. The figure could be interpreted as if the string length is always expressed in 7 bits, which I believe is not true. 6.2.1. Literal Header Field with Incremental Indexing A literal header field with incremental indexing representation results in appending a header field to the decoded header list and inserting it as a new entry into the dynamic table. 0 1 2 3 4 5 6 7 +---+---+---+---+---+---+---+---+ | 0 | 1 | Index (6+) | +---+---+-----------------------+ | H | Value Length (7+) | +---+---------------------------+ | Value String (Length octets) | +-------------------------------+ Figure 6: Literal Header Field with Incremental Indexing - Indexed Name See above. One can read this as "H" being bit 0 in the second octet; but that's not always the case. I think it would be good to revise the figure format. 6.3. Dynamic Table Size Update The new maximum size MUST be lower than or equal to the last value of the SETTINGS_HEADER_TABLE_SIZE parameter (see Section 6.5.2 of [HTTP2]) received from the decoder and acknowledged by the encoder (see Section 6.5.3 of [HTTP2]). Some parts of the spec read as if HPACK doesn't have direct dependency on HTTP/2 or HTTP in general. However, here a normative dependency is introduced. 7.1. Probing Dynamic Table State This is possible even over TLS, because while TLS provides confidentiality protection for content, it only provides a limited amount of protection for the length of that content. Expand "TLS" on first use, and add a reference. 9. References 9.1. Normative References [HTTP2] Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext Transfer Protocol version 2", draft-ietf-httpbis-http2-16 (work in progress), October 2014. November (already fixed in Git). 9.2. Informative References [CRIME] Rizzo, J. and T. Duong, "The CRIME Attack", September 2012, <https://docs.google.com/a/twist.com/ presentation/d/ 11eBmGiHbYcHR9gL5nDyZChu_-lCa2GizeuOfaLU2HOU/ edit#slide=id.g1eb6c1b5_3_6>. The fragment identifier on the URI appears to take as to the last slide; maybe drop it? [HUFFMAN] Huffman, D., "A Method for the Construction of Minimum Redundancy Codes", Proceedings of the Institute of Radio Engineers Volume 40, Number 9, pp. 1098-1101, September 1952, <https://ieeexplore.ieee.org/xpl/ articleDetails.jsp?arnumber=4051119>. That URI leads me to a broken page due to mixed content problems. The HTTP version works fine. [SPDY] Belshe, M. and R. Peon, "SPDY Protocol", draft-mbelshe-httpbis-spdy-00 (work in progress), February 2012. It's not work in progress, and we should have a URI for it. Yes, that's something the RFC Editor will need to figure out :-). Appendix A. Static Table Definition The static table (see Section 2.3.1) consists of a predefined and unchangeable list of header fields. The static table was created by listing the most common header fields that are valid for messages exchanged inside a HTTP/2 connection. Is that really true? How come we have Proxy-Auth* in it then, and also an unregistered header field ("refresh")? I believe we need to document how we came up with this list. | 16 | accept-encoding | gzip, deflate | Appendix B. Huffman Code As an example, the code for the symbol 47 (corresponding to the ASCII character "/") consists in the 6 bits "0", "1", "1", "0", "0", "0". "consists of"? This corresponds to the value 0x18 (in hexadecimal) encoded on 6 bits. "encoded in"? C.3. It would be cool to have an example with repeating header fields (maybe two instances of Cache-Control)?
Received on Saturday, 10 January 2015 16:59:19 UTC