Re: More SPDY Related Questions.. from Roberto Peon on 2012-07-21 (ietf-http-wg@w3.org from July to September 2012)

From: Roberto Peon <grmocg@gmail.com>
Date: Sat, 21 Jul 2012 12:41:50 -0700
To: James M Snell <jasnell@gmail.com>
Cc: ietf-http-wg@w3.org
Message-ID: <CAP+FsNen8pnp0Ph8HnEgudtgbavLM88TtweBacurFoNx2v5o=g@mail.gmail.com>
On Jul 21, 2012 12:10 PM, "James M Snell" <jasnell@gmail.com> wrote:
>
> Continuing my review of the SPDY draft... have a few questions relating
> to SPDY and load balancers / reverse proxy setups... The intent is not to
> poke holes but to understand what the SPDY authors had in mind for these
> scenarios...

Poke away!

>
> 1. Imagine two client applications (A and B) accessing an Origin (D) via
> a Reverse Proxy (C). When a client accesses /index.html on Origin D, the
> Origin automatically pushes static resources /foo.css, /images/a.jpg and
> /video/a.mpg to the client.
>
> Basic flow looks something like...
>
> A                  RP                 O
> |                   |                 |
> |                   |                 |
> |==================>|                 |
> | 1)SYN             |                 |
> |<==================|                 |
> | 2)SYN_ACK         |                 |
> |==================>|                 |
> | 3)ACK             |                 |
> |==================>|                 |
> | 4)SYN_STREAM (1)  |                 |
> |                   |================>|
> |                   | 5) SYN          |
> |                   |<================|
> |                   | 6) SYN_ACK      |
> |                   |================>|
> |                   | 7) ACK          |
> |                   |================>|
> |                   | 8) SYN_STREAM(1)|
> |                   |<================|--
> |                   | 9) SYN_STREAM(2)| |
> |                   |  uni=true       | |
> |<==================|                 | |
> | 10) SYN_STREAM(2) |                 | |
> |  uni=true         |                 | | Content Push
> |                   |<================| |
> |                   | 11) SYN_REPLY(1)| |
> |<==================|                 | |
> | 12) SYN_REPLY(1)  |                 | |
> |                   |                 | |
> |                   |<================| |
> |<==================| 13) DATA (2,fin)|--
> | 14) DATA (2,fin)  |                 |
> |                   |                 |
> |                   |                 |
>
> My question is: what does this picture look like if Clients A and B
> concurrently request /index.html?
>
> With HTTP/1.1, static resources can be pushed off to CDNs, stored in
> caches, distributed around any number of places in order to improve overall
> performance. Suppose /index.html is cached at the RP. Is the RP expected to
> also cache the pushed content?

It could if the pushed content indicated it was cacheable using HTTP/1.1
caching semantics/headers.

> Is the RP expected to keep track of the fact that /foo.css, /images/a.jpg
> and /video/a.mpg were pushed before and push those automatically from its
> own cache when it returns the cached instance of /index.html?

No, but a smart implementation would, if the cache headers allow it.
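
A minimal sketch of what such a "smart" RP might keep track of, assuming it
simply remembers which cacheable resources were pushed alongside each page.
The names `is_cacheable` and `PushMemory` are hypothetical, and the
Cache-Control check is deliberately simplified (it ignores max-age, no-cache
and the rest of the HTTP/1.1 caching rules):

```python
def is_cacheable(headers):
    """Rough check: cacheable unless Cache-Control says no-store or private."""
    raw = headers.get("Cache-Control", "")
    directives = {d.strip().lower() for d in raw.split(",")}
    return not ({"no-store", "private"} & directives)


class PushMemory:
    """Remembers which resources the origin pushed with each parent URL."""

    def __init__(self):
        # parent URL -> list of pushed URLs, in the order first seen
        self.pushed_with = {}

    def record_push(self, parent_url, pushed_url, headers):
        # Only remember pushes the cache headers allow us to store.
        if not is_cacheable(headers):
            return
        urls = self.pushed_with.setdefault(parent_url, [])
        if pushed_url not in urls:
            urls.append(pushed_url)

    def pushes_for(self, parent_url):
        # Resources to consider pushing when parent_url is served from cache.
        return list(self.pushed_with.get(parent_url, []))


memory = PushMemory()
memory.record_push("/index.html", "/foo.css", {"Cache-Control": "public"})
memory.record_push("/index.html", "/video/a.mpg", {"Cache-Control": "no-store"})
print(memory.pushes_for("/index.html"))  # ['/foo.css']
```

When the RP later answers /index.html from its cache, it would consult
`pushes_for("/index.html")` to decide which cached resources to push.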

> If not, when the caching proxy returns index.html from its cache, A and
> B will be forced to issue GETs for the static resources, defeating the
> purpose of pushing those resources in the first place.
>
> In theory, we could introduce new Link rels in the same spirit as
> http://tools.ietf.org/html/draft-nottingham-linked-cache-inv-03 that tell
> caches when to push cached content... e.g.

Yup

>
> SYN_STREAM
>   id=2
>   unidirectional=true
>   Content-Location: http://example.org/images/foo.jpg
>   Content-Type: image/jpeg
>   Cache-Control: public
>   Link: </index.html>; rel="cache-push-with"
>
> What does cache validation look like for pushed content? E.g. what
> happens if the cached /index.html is fresh and served from the cache but
> the related pushed content also contained in the cache is stale?

Same as any normal cache expiry.
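
To make "normal cache expiry" concrete, here is a rough sketch in which each
pushed object is judged on its own freshness, independently of the page it
was pushed with. `CachedObject`, `plan_pushes` and the three action labels
are hypothetical names for illustration, and freshness is reduced to a
single max-age value:

```python
from dataclasses import dataclass


@dataclass
class CachedObject:
    stored_at: float  # time the response was stored, in seconds
    max_age: float    # freshness lifetime, in seconds

    def is_fresh(self, now):
        return (now - self.stored_at) < self.max_age


def plan_pushes(cache, pushed_urls, now):
    """Decide, per pushed URL, what the cache may do when serving the page."""
    plan = {}
    for url in pushed_urls:
        obj = cache.get(url)
        if obj is None:
            plan[url] = "skip"              # nothing cached, client will GET it
        elif obj.is_fresh(now):
            plan[url] = "push-from-cache"   # fresh, safe to push as-is
        else:
            plan[url] = "revalidate-first"  # stale, check with the origin
    return plan


cache = {
    "/foo.css": CachedObject(stored_at=0, max_age=100),
    "/images/a.jpg": CachedObject(stored_at=0, max_age=10),
}
print(plan_pushes(cache, ["/foo.css", "/images/a.jpg"], now=50))
# {'/foo.css': 'push-from-cache', '/images/a.jpg': 'revalidate-first'}
```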

>
> I'm sure I can come up with many more questions, but it would appear to
> me that server push in SPDY is, at least currently, fundamentally
> incompatible with existing intermediate HTTP caches and RPs, which is
> definitely a major concern.

Given that the headers on the pushed objects indicate cacheability, I'm
either misunderstanding or missing something :)

>
> As a side note, however, it does open up the possibility for a new type
> of proxy that can be configured to automatically push static content on the
> Origin's behalf... e.g. A SPDY Proxy that talks to a backend HTTP/1.1
> server and learns that /images/foo.jpg is always served with /index.html so
> automatically pushes it to the client. Such services would be beneficial in
> general, but the apparent incompatibility with existing deployed
> infrastructure is likely to significantly delay adoption. Unless, of
> course, I'm missing something fundamental :-)

Jetty has done this already publicly, btw.
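
As an illustration of the "learning" proxy idea, the sketch below counts how
often a resource is requested with a given page as its Referer and proposes
it for push once a threshold is reached. This is a hypothetical example, not
a description of how Jetty does it; `PushLearner` and its `threshold`
parameter are invented for the sketch:

```python
from collections import defaultdict


class PushLearner:
    """Learns which subresources tend to be requested for a page."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        # page URL -> {subresource URL -> number of times seen}
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, url, referer=None):
        # A request carrying a Referer is treated as a subresource of that page.
        if referer:
            self.counts[referer][url] += 1

    def push_candidates(self, page):
        seen = self.counts.get(page, {})
        return sorted(u for u, n in seen.items() if n >= self.threshold)


learner = PushLearner(threshold=2)
learner.observe("/images/foo.jpg", referer="/index.html")
learner.observe("/images/foo.jpg", referer="/index.html")
learner.observe("/rare.js", referer="/index.html")
print(learner.push_candidates("/index.html"))  # ['/images/foo.jpg']
```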

>
> 2. While we're on the subject of Reverse Proxies... the SPDY spec currently
> states:
>
>    When a SYN_STREAM and HEADERS frame which contains an
>    Associated-To-Stream-ID is received, the client must
>    not issue GET requests for the resource in the pushed
>    stream, and instead wait for the pushed stream to arrive.
>
>    Question is: Does this restriction apply to intermediaries like
> Reverse Proxies?

Yup. RPs are both servers and clients (depending on the side one is on) and
the respective requirements are still in force.

> For instance, suppose the server is currently pushing a rather large
> resource to client A and Client B comes along and sends a GET request for
> that specific resource. Assume that the RP ends up routing both requests to
> the same backend Origin server. A strict reading of the above requirement
> means that the RP is required to block Client B's GET request until the
> push to Client A is completed.

Only per connection. The intention and purpose of this requirement is to
eliminate the potential race whereby the server is pushing an object, but
the client is requesting it as well (because it doesn't yet know the object
is already on its way).
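
A small sketch of the per-connection rule, assuming each connection keeps its
own set of URLs announced by a pushed SYN_STREAM. `Connection` and its method
names are hypothetical; the point is only that the check is scoped to one
connection:

```python
class Connection:
    """Tracks pushes announced on this connection only."""

    def __init__(self):
        # URLs announced by a pushed SYN_STREAM that have not finished yet
        self.pending_pushes = set()

    def on_push_announced(self, url):
        self.pending_pushes.add(url)

    def on_push_finished(self, url):
        self.pending_pushes.discard(url)

    def may_send_get(self, url):
        # Avoid the race: no GET for something already on its way here.
        return url not in self.pending_pushes


conn_a = Connection()  # connection serving Client A's request
conn_b = Connection()  # separate connection serving Client B's request

conn_a.on_push_announced("/video/a.mpg")
print(conn_a.may_send_get("/video/a.mpg"))  # False
print(conn_b.may_send_get("/video/a.mpg"))  # True
```

Under this reading, a push in progress for Client A does not block a GET for
the same resource issued on a different connection.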

> Further, the spec is not clear if this restriction only applies for
> requests sent over the same TCP connection. Meaning, a strict reading of
> this requirement means that even if the RP opens a second connection to the
> Origin server, it is still forbidden to forward Client B's GET request
> until Client A's push has been completed.

We should clarify that the invariant here exists to prevent the race
mentioned above and any behavior is allowed so long as that race is
prevented.

-=R

>
>
> - James
>
>
Received on Saturday, 21 July 2012 19:42:19 UTC