Re: Media Queries and optimizing what data gets transferred

On Mon, Jan 28, 2013 at 5:23 AM, Henri Sivonen <hsivonen@iki.fi> wrote:

> On Sun, Jan 27, 2013 at 1:38 AM, Ilya Grigorik <ilya@igvita.com> wrote:
> > (1) It is *not* a question either/or. To the extent possible, both a
> > client-side and a server-side solution should be available for content
> > adaptation. Posing Client-Hints as "breaking" or impeding client-side
> > adaptation misses the point entirely.
>
> I disagree. It doesn't make sense to provide the facilities for
> server-side adaptation if providing facilities for client-side
> adaptation can yield better results. For example, I think we should
> not have provided an Accept-Codecs mechanism in addition to the
> multi-<source> client-driven alternative selection in <video>. In
> particular, when the client chooses from a set of alternatives offered
> to it, where each alternative has a distinct URL, the result is
> maximally friendly to intermediate caches including CDNs. In the
> multi-<source> design, videos can be on a dumb CDN that neither needs
> to implement codec negotiation logic nor needs to be able to consult
> an origin server for such negotiation. The CDN only needs to be able
> to serve files in the setting where there is a single file of bytes
> corresponding to each URL.
>

Good thing we don't all have to agree. :-)

(1) The server can do a much better job of optimizing images - that's a
fact. We know this from first-hand experience with WPO products.
(2) Even if you provided all the markup, preload scanners in browsers don't
always have all the viewport / CSS information at hand when they kick off
speculative requests. Server negotiation resolves this.
(3) Your argument for "friendly to intermediate caches" is bogus. Vary
exists for a reason (see the sketch below). CDNs don't want to be "dumb"
either - they're all increasingly offering edge optimization services. And
guess what: people are buying those services! While, as I said earlier, I'm
all for markup solutions, in practice many people are (rightfully) willing
to pay a CDN or invest in their own deployments, which can automate the
problem. The cost of your development time >>> the cost of automating the
problem.
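To make the Vary point concrete, here is roughly the kind of exchange I
have in mind (header names and values are illustrative, not final syntax):

  GET /photo.jpg HTTP/1.1
  Host: example.com
  Client-Hints: dpr=2.0, dw=320

  HTTP/1.1 200 OK
  Content-Type: image/jpeg
  Cache-Control: public, max-age=86400
  Vary: Client-Hints

The edge cache keys the response on the (URL, Client-Hints value) pair, so
the next request with the same hints is served from the edge without a
trip back to the origin.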

Once again, we can have a philosophical discussion about which approach we
prefer, but you have to look at what people are doing in real life, and
where the dollars and cents are going.


> > (2) Some things are far better handled by the server, not by the client.
> > Things like resizing images to optimal width and height, in the world of
> > ever exploding form factors is easily automated at the server level, and
> > leads to massive bloat on the client. I don't mean to pick on
> picturefill,
> > but just for the sake of an example:
> > https://github.com/scottjehl/picturefill#hd-media-queries
>
> Does anyone with a popular site (but not as popular as Google)
> actually want to rescale images on the fly (as opposed to letting
> Opera do it for them)? I really doubt that on-the-fly scaling to
> precise client dimensions would become popular across the Web.
> Instead, I expect sites to generate a handful of pre-scaled image files
> and then arrange something for choosing between them. In such a
> scenario where the set of pre-scaled images is known in advance, the
> design trade-offs become similar to the <video> codec negotiation
> case. If each scaled image has its distinct URL, the site can declare
> all the available scaled images in HTML and let the browser choose.
> And this will again work efficiently without any logic in CDNs.
>

Yes, plenty of sites, both large and small. A random sample:

http://developer.wordpress.com/docs/photon/
http://docs.sencha.io/current/index.html#!/guide/src
http://adaptive-images.com/

Once again, let's not cast CDNs as adversaries - they're not. Instead,
they are the ones who can help us make the web faster.


> As for bloat in terms of the number of URLs that need to be offered to
> the client, I don't expect the number of different pre-scaled images
> to keep growing and growing. At some point, sites will just opt for
> letting some devices scale down a slightly larger image or scale up a
> slightly smaller image. Also, the example you are referring to tries
> to handle art direction and mere scaling in the same example. Of
> course, you get more combinations when you do art direction. Without
> art direction, you could use the same image in both the X-width retina
> and the X*2-width non-retina cases.
>
> I think the example looks complex, because it tries to shoehorn both
> HiDPI support and width-based art direction in the same syntax. I
> don't think it follows that an HTTP-based solution is needed. It
> might be that a better solution would be separating the syntaxes for
> HiDPI and art direction.
>

I'm not convinced by "what I think" arguments. I'm commenting on what
exists - and it's not looking pretty. Once again, the fact that customers
are willing to go to EdgeCast / Akamai / etc. and put down dollars and
cents for this problem to be solved, tells me that this is a real problem.


> > Having said that, images are just one example. Client-Hints is a generic,
> > cache-friendly transport for client-server negotiation.
>
> Why do you characterize Client-Hints as cache-friendly? It seems to me
> that with Vary: Client-Hints, even the local cache gets invalidated if
> the user rotates the device 90° or if the current bandwidth estimate
> changes.
>

It's cache-friendly compared to any other existing alternative... which is
to say, there is no alternative that enables caching at all. Either you're
stuck with UA sniffing, or you're relying on cookies (which are also not
cache-friendly, and don't work for cross-domain cases).

See earlier discussions here:
https://docs.google.com/document/d/1xCtGvPbvVLacg45MWdAlLBnuWa7sJM1cEk1lI6nv--c/edit

Finally, yes, by definition Vary *will* add more variants of the resource
to your cache. But the fact that you can cache to begin with is already a
breakthrough. From there, we can start talking about which variables should
be present to avoid unnecessary fragmentation.
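To illustrate with made-up values: a cache varying on Client-Hints simply
keeps one entry per distinct hint value it has seen, e.g.

  (/photo.jpg, "dpr=1.0, dw=320") -> 320px variant
  (/photo.jpg, "dpr=2.0, dw=320") -> 640px variant

Two variants instead of one, yes - but both are cacheable, and the fewer
and coarser the variables we send, the fewer variants we create.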

> > Not true. The whole point of Client-Hints is to enable caches to perform
> > "Vary: Client-Hints". What you've described is how the process works
> > today... the requests are forced down to origin because we don't have a
> > clean cache key to vary on.
>
> So if the cache knows that the response from the origin server varies
> depending on what value the request header Client-Hints has, the cache
> won't be able to use the cached response when a new request comes with
> a different Client-Hints value—even from the same browser after its
> bandwidth estimate has changed slightly—without consulting the
> origin server.
>

Let's take bandwidth off the table - I'm removing it from the spec. Before
we talk about bandwidth in Client-Hints, we need to fix the NetInfo API.

In short: we're on the same page with respect to having to define the
variables which will result in useful adaptations, while minimizing the
number of variants. That's a good discussion to have.


> >>  * If the origin server doesn't get ETags right, intermediate caches
> >> end up having a distinct copy of the data for each distinct
> >> Client-Hints header value even if there is a smaller number of
> >> different data alternatives on the origin server.
> >
> > ETags have *nothing* to do with this, and ETag is not a mechanism for
> > varying responses to begin with.
>
> Have I misunderstood how HTTP cache validation works? If the cache
> already has a response entity with an ETag and Vary: Client-Hints and
> a new response to the cache comes in with a different value for
> Client-Hints, isn't the cache supposed to issue a conditional request
> with the ETag back to the origin server so that the origin server gets
> to indicate whether the new Client-Hints value results in a different
> response body or in the same one the cache already has?
>

Yes, that's correct - that's the behavior with Vary. Having said that, a
nitpick: ETag is an opaque token, and if the resource has changed it should
probably be a different value anyway.

Once again, I think your concern is fragmentation - fair enough, see my
earlier comment.
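For reference, the revalidation flow we're describing looks roughly like
this (illustrative values):

  GET /photo.jpg HTTP/1.1
  Client-Hints: dpr=2.0
  If-None-Match: "v1-320px"

  HTTP/1.1 304 Not Modified
  ETag: "v1-320px"
  Vary: Client-Hints

If the new hint value maps onto a representation the cache already holds,
the origin can answer 304 and no body is re-transferred; otherwise it sends
a 200 with the new variant.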


> >>  * Sending any HTTP header incurs extra traffic for all the sites that
> >> don't pay attention to Client-Hints. That would be the whole Web at
> >> least at first. That is, an HTTP-based solution involves a negative
> >> externality for non-participating sites.
> >
> > This is easily addressed by making it an opt-in mechanism for HTTP/1.1.
>
> How would you handle the initial contact with the site? How would
> opting into Client-Hints be better than setting a cookie? You
> mentioned that cookies don't work cross-origin. How would Client-Hints
> opt-ins work cross-origin?
>

See my earlier document for why cookies don't cut it.

For opt-in, a mechanism similar to Alternate-Protocol can be provided:
http://www.chromium.org/spdy/spdy-protocol/spdy-protocol-draft2#TOC-Server-Advertisement-of-SPDY-through-the-HTTP-Alternate-Protocol-header
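Strawman sketch (the header names here are made up for illustration): the
server advertises support on its first response, and the client starts
sending hints from then on:

  HTTP/1.1 200 OK
  Accept-CH: dpr, dw

  ... subsequent requests from that client ...

  GET /photo.jpg HTTP/1.1
  Client-Hints: dpr=2.0, dw=320

That keeps the extra request bytes off the wire for every site that never
opted in.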



> > Further, the "cost" of upstream bytes, which is in the dozens of bytes,
> > is easily offset by saving hundreds of kilobytes in the downstream (in
> > the case of images). The orders-of-magnitude difference is well worth it.
>
> That might be true if you consider the cost in the context of a site
> that actually pays attention to Client-Hints. But there's the
> externality that you end up sending Client-Hints even to sites that
> don't pay attention to it (unless it's opt-in).
>

30 bytes or less + opt-in... Plus, as Patrick already pointed out, the 30
bytes are not overflowing the CWND. I'm with you on concerns about adding
any extra bytes, but this is not an argument against Client-Hints.
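To put rough numbers on it: 30 bytes of hints across, say, 50 requests on a
page is ~1.5KB upstream; resizing a single 300KB image down to 100KB saves
200KB downstream. That's two orders of magnitude back on one resource
alone.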


> > correctly: wrong formats, wrong sizes, etc. Automation solves this
> problem.
> > While not 100% related to this discussion, see my post here:
> > http://www.igvita.com/2012/12/18/deploying-new-image-formats-on-the-web/
>
> This kind of "server automation will save us" argument would be easier
> to buy if someone had already demonstrated a Web server that
> automatically runs pngcrush on all PNG files and compresses JPEGs with
> a better encoder than the one in libjpeg.
>
> Why isn't such a server in popular use, and why should we expect a
> server that automatically scales images in response to Client-Hints to
> enter into popular use?
>

Oh hai: https://developers.google.com/speed/pagespeed/mod

200K+ sites are using it, plus third-party integrations (EdgeCast, GoDaddy,
DreamHost, and others)...
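If you haven't tried it, enabling the image pipeline is a couple of lines
of Apache config - roughly (check the docs for the exact filter names):

  ModPagespeed on
  ModPagespeedEnableFilters rewrite_images

where rewrite_images recompresses and resizes the images referenced from
the page.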



> > Once again, they're not exclusive. If you don't have a server that can
> > support image optimization, you should be able to hand-tune your markup.
> I'm
> > all for that.
>
> "Not exclusive" means that there's more stuff—hence bad for
> learnability of the platform.
>

Nobody is forcing you to use it. If you only want to learn the markup way,
then please be my guest!

ig

Received on Monday, 28 January 2013 19:12:01 UTC