How to handle content-encoding

I'm thinking through how to add support for Content-Encoding to lua-http
https://github.com/daurnimator/lua-http/issues/22

A brief digression on lua-http's structure (library terminology is borrowed
from HTTP2):
  - a 'connection' encapsulates a socket, a connection has many streams
  - a 'stream' is a request/response pair (a request can have multiple
header blocks, and many data chunks)
      - The same stream structure is used for both client and server
      - You can implement an HTTP proxy by forwarding items from one stream
to another
  - a 'request' is a pre-prepared object consisting of a request header
block, a function to obtain body chunks, and a destination.
      - `request:go()` returns the 'main' response header block and a
stream (from which you can read the body one chunk at a time)

There is a desire to compress content to save bandwidth. HTTP has had two
main ways to do this: Transfer-Encoding and Content-Encoding.

For me, adding support for Transfer-Encoding was simple, without any
ambiguities or issues. For HTTP1, in the stream logic:
  - If zlib is installed, we automatically add `TE: gzip, deflate`.
  - On reply, if Transfer-Encoding contains gzip or deflate, we decode the
body before passing it on to the caller.
This is permitted as TE and Transfer-Encoding are hop-by-hop headers.
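
To make the decode step concrete, here is a minimal sketch in Python (not
lua-http code) of decoding a gzip or deflate body one chunk at a time. The
function name `decode_chunks` and its arguments are hypothetical; it assumes a
single coding and that "deflate" means the zlib format, as the HTTP specs
define it.

```python
import zlib


def decode_chunks(chunks, coding):
    """Yield decoded body chunks for a single gzip or deflate coding.

    This is a generator, so an unsupported coding raises ValueError
    when iteration starts, not when the function is called.
    """
    if coding == "gzip":
        wbits = 16 + zlib.MAX_WBITS  # expect gzip header and trailer
    elif coding == "deflate":
        wbits = zlib.MAX_WBITS  # HTTP "deflate" is the zlib format
    else:
        raise ValueError("unsupported coding: " + coding)

    decoder = zlib.decompressobj(wbits)
    for chunk in chunks:
        data = decoder.decompress(chunk)
        if data:  # a small input chunk may not produce output yet
            yield data
    tail = decoder.flush()
    if tail:
        yield tail
```

Because the decoder keeps its state between chunks, the caller can keep
reading the body one chunk at a time without buffering the whole response.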

However, HTTP2 does not support Transfer-Encoding.
Furthermore, certain servers **stares at twitter.com** send
`Content-Encoding: gzip` even if you *don't* send `Accept-Encoding: gzip`.
This seems to demand that I support Content-Encoding.

As far as the specifications go, Content-Encoding is *meant* to be used
for end-to-end encoding that intermediate hops do not touch.
  - Intermediaries should cache Content-Encoded bodies in their encoded form
  - ETag is dependent on Content-Encoding

This makes it hard to find a place for it in lua-http's structure.
If I add it transparently in the stream (as done for Transfer-Encoding),
then it will be hop-by-hop (not end-to-end).
This seems to demand (at least for client requests) that it is switched
on/off at the request layer.
From there though, it seems I would need to add some sort of stream body
filter?
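
One possible shape for this, sketched in Python rather than Lua: a
request-level flag decides whether to advertise codings and whether to wrap
the response body in a decoding filter. Everything here (the names
`prepare_request_headers`, `filter_body`, `decode_content`, and the use of
plain dicts with lowercase keys for headers) is a hypothetical illustration,
not lua-http's API.

```python
import zlib

# wbits values for zlib.decompressobj: 16 + MAX_WBITS expects gzip framing,
# MAX_WBITS expects the zlib framing that HTTP calls "deflate".
WBITS = {"gzip": 16 + zlib.MAX_WBITS, "deflate": zlib.MAX_WBITS}


def prepare_request_headers(headers, decode_content):
    """Return a copy of the request headers, advertising codings if enabled."""
    headers = dict(headers)
    if decode_content:
        headers.setdefault("accept-encoding", "gzip, deflate")
    return headers


def filter_body(response_headers, chunks, decode_content):
    """Return (headers, chunk iterator), decoding only when asked to."""
    coding = response_headers.get("content-encoding", "identity")
    coding = coding.strip().lower()
    if not decode_content or coding not in WBITS:
        # Pass through untouched, e.g. for a proxy forwarding the
        # encoded representation end-to-end.
        return response_headers, chunks

    # The decoded body no longer matches these headers, so drop them.
    headers = {k: v for k, v in response_headers.items()
               if k not in ("content-encoding", "content-length")}

    def decoded():
        decoder = zlib.decompressobj(WBITS[coding])
        for chunk in chunks:
            data = decoder.decompress(chunk)
            if data:
                yield data
        tail = decoder.flush()
        if tail:
            yield tail

    return headers, decoded()
```

With the flag off, the stream layer stays unaware of Content-Encoding and the
body is forwarded as received; with it on, only the caller that asked for
decoding sees the decoded chunks.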

How should I be adding this? What have other implementations done? (and
what do they wish they'd done differently?)
The current state seems to be *against* the spec: should the spec be
changed? Should implementations be updated?
HTTP2 has no Transfer-Encoding equivalent... why not?

Regards,
Daurn.


Links:
  - https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.11
    Original content-encoding spec
  - https://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.5.1
    Hop-by-hop headers
  - https://tools.ietf.org/html/rfc7231#section-3.1.2.1
    Current spec
  - https://bugzilla.mozilla.org/show_bug.cgi?id=68517
    Mozilla disregards Content-Encoding spec
  - https://stackoverflow.com/questions/11641923/transfer-encoding-gzip-vs-content-encoding-gzip
  - https://daurnimator.github.io/lua-http/
    lua-http documentation

Received on Tuesday, 31 May 2016 02:48:09 UTC