- From: Roberto Peon <grmocg@gmail.com>
- Date: Sun, 20 Apr 2014 04:27:05 -0700
- To: David Krauss <potswa@gmail.com>
- Cc: Adrian Cole <adrian.f.cole@gmail.com>, HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <CAP+FsNd5-VYbbeFb_N0mh+mXqBOzSHHB4mZHCqY45RT0GLO5YQ@mail.gmail.com>
On Sun, Apr 20, 2014 at 3:58 AM, David Krauss <potswa@gmail.com> wrote:

>
> On 2014–04–20, at 11:59 AM, Roberto Peon <grmocg@gmail.com> wrote:
>
> On Sat, Apr 19, 2014 at 7:23 PM, David Krauss <potswa@gmail.com> wrote:
>
>> There’s some circular reasoning here. Interoperability refers to what
>> intermediaries may change, or to a lesser extent what synonymous
>> bit-codings portable APIs (e.g. Javascript XHR) may merge. If an underlying
>> representation may be changed according to the semantics it expresses, then
>> relying on bits is not interoperable.
>>
>
> Interoperable implementations of the protocol may not understand each
> other at the application layer. That is not a problem the protocol can
> solve.
>
>
> I’m talking about whether a proxy is allowed to internally use a generic
> END_SEGMENT symbol, or if it must distinguish the case of a bit in an empty
> DATA or HEADERS frame following a frame of the other type. In this way the
> application layer is creeping into transport.
>
> Additional questions I still have are whether data may be coalesced across
> a headers block, and whether header blocks in the same segment may be
> coalesced. I don’t think there are many use-cases for either, and they
> could be surprising to applications that forget END_SEGMENT. However, the
> first case does seem to be allowed according to the current spec. (It says
> nothing about coalescing headers, but then again I can’t find where it
> explicitly describes coalescing data either. If headers are not guaranteed
> to arrive at any particular point, they could all get pushed to the
> beginning or end of the segment, anyway.)
>

You're ascribing a semantic that I'm not thinking is the semantic of the
protocol.
The protocol ensures that any END_SEGMENT occurs at the same byte offset
from the first byte of a stream as when it was created. Similarly, the
protocol ensures that HEADERS are at the same byte offset.

>
> These are protocol and transport issues.
>
> A different programmer sensibility is often applied to binary coding than
>> to text, but it’s best to use the same approach either way. A format
>> defines the expression of a variety of messages, and those messages
>> comprise the only defined meaning.
>>
>>
> I suspect we're arguing semantics at such a level at this point that it
> doesn't matter, but the protocol cannot define a meaning: It defines a
> grammar.
>
>
> Also restrictions and allowances on transporting messages in that grammar,
> including some degree of rearrangement. What the sender sees is not always
> what the receiver gets.
>

That depends on how it is defined. As I state above, END_SEGMENT or HEADERS
are always at the same byte offset in the datastream. Frames otherwise have
zero semantic meaning, as they can be broken up/coalesced at will.

>
> At this point I suspect we're mostly violently agreeing.
>
>
> Mostly. It comes down to the grammar:
>
> (HEADERS_WITH_END_SEGMENT | DATA_WITH_END_SEGMENT)
>
>
> Having these two symbols allows for saving 8 bytes, but introduces
> possible application design confusion. Application designers need to decide
> whether the two symbols should (or should not) mean the same thing, and API
> designers whether to support the distinction. Such support actually
> requires *three* symbols, with an additional AUTOSELECT_END_SEGMENT to be
> used by the sender.
>

They always mean the same thing. Frame-level details should not be surfaced
at the application layer.
HTTP2 is not a frame-oriented protocol. It is either message (END_SEGMENT)
or bytestream. Frames are subordinate to either of these and form the
building blocks for creating streams of messages.

>
> Given my suspicions about coalescing data across headers, right now I’m
> thinking that each HEADERS frame should start a new “message” and all of
> segmentation is redundant. Applications that want a sequence of data-only
> messages with no metadata can spend 8 bytes on an empty HEADERS frame.
> Header-only protocols see no overhead.
>

No.

>
> In any case, 8 bytes is only equal to the overhead of the extra DATA frame
> that any segmentation implicitly requires, and nothing compared to the
> total overhead of flushing which is also likely to happen. We shouldn’t
> sacrifice anything for the sake of 8 bytes.
>

The flag bits are there to be used, there is no sacrifice here :)

>
> —
>
> The earlier BNF was imprecise, and it might help the big picture to
> definitively record the application-level view.
>
> The current spec:
>
> stream:
> header-block segment* unterminated-segment? (end-stream|rst-stream)
>
> segment:
> unterminated-segment (headers-end-segment | data-end-segment)
>
> unterminated-segment:
> header-block* data-octet*
> (Transport may move the headers relative to the data, such that their
> order within a segment is insignificant.)
>
>

Applications should never see frames.
They should probably get things like:
Got headers on the stream.
Got bytes on the stream.
Got end of message.
Got end of stream.

> What we get by fixing the location of all header blocks in the data
> stream, sacrificing multiple header blocks within a segment (replaceable by
> a user-defined x-begin-message header), and adding 8 bytes per segment
> that doesn’t start with headers:
>
> stream:
> segment+ (end-stream|rst-stream)
>
> segment:
> header-block data-octet*
>
> I think this better matches what application designers expect. I didn’t
> include use of END_SEGMENT as an abnormal termination indicator in the list
> of sacrifices, because RST_STREAM already does that.
>

Application designers should never see the frame-level stuff. If they're
ascribing semantic value to the frames, they're doing it wrong and their
application *will* break as it goes through a proxy.

The application-layer grammar's atoms are: metadata, bytes, end-of-message,
end-of-stream.

-=R

>
> It’s also much simpler to correctly specify, and describe usage.
>
>
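To make the application-layer view above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: `Frame` and `events` are hypothetical names, the frame model is a simplification rather than the wire format, and header blocks are treated as opaque bytes. It shows that two different framings of the same stream (as a proxy might produce by splitting DATA frames or moving END_SEGMENT onto an empty DATA frame) yield the same sequence of metadata, bytes, end-of-message, and end-of-stream events.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Frame:
    """Hypothetical, simplified frame model (not the wire format)."""
    kind: str                  # "HEADERS" or "DATA"
    payload: bytes = b""       # opaque header block or data octets
    end_segment: bool = False  # END_SEGMENT flag
    end_stream: bool = False   # END_STREAM flag


Event = Tuple[str, bytes]


def events(frames: List[Frame]) -> List[Event]:
    """Translate frames into application-layer events.

    Adjacent runs of data are merged, so frame boundaries are not
    visible in the result; only the byte offsets of headers and of
    end-of-message markers are preserved.
    """
    out: List[Event] = []
    for f in frames:
        if f.kind == "HEADERS":
            out.append(("metadata", f.payload))
        elif f.payload:
            if out and out[-1][0] == "bytes":
                # Coalesce with the preceding run of bytes.
                out[-1] = ("bytes", out[-1][1] + f.payload)
            else:
                out.append(("bytes", f.payload))
        if f.end_segment:
            out.append(("end-of-message", b""))
        if f.end_stream:
            out.append(("end-of-stream", b""))
    return out


# Framing as the sender might have produced it.
sender = [
    Frame("HEADERS", b"h1"),
    Frame("DATA", b"hello world", end_segment=True),
    Frame("DATA", b"bye", end_segment=True, end_stream=True),
]

# The same stream after an intermediary split the DATA frames and
# carried END_SEGMENT on an empty DATA frame.
proxied = [
    Frame("HEADERS", b"h1"),
    Frame("DATA", b"hello "),
    Frame("DATA", b"world"),
    Frame("DATA", b"", end_segment=True),
    Frame("DATA", b"by"),
    Frame("DATA", b"e", end_segment=True, end_stream=True),
]

same_view = events(sender) == events(proxied)
```

An application written against `events` cannot tell the two framings apart, which is the property argued for above; one that inspected `Frame` objects directly could.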
Received on Sunday, 20 April 2014 11:27:34 UTC