- From: <bugzilla@jessica.w3.org>
- Date: Sat, 30 Apr 2011 06:12:11 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=12569

--- Comment #5 from Wojciech Hlibowicki <wojjie2nd@gmail.com> 2011-04-30 06:12:09 UTC ---

(In reply to comment #4)
> This doesn't at all address the fact that the ZIP file has to be transferred
> serially, byte-by-byte, meaning the files are in fact NOT downloaded in
> parallel, which creates a situation where the page will load MUCH slower than
> it normally would, by disabling all parallel downloading.

Parallel downloading is more of a workaround than a solution. Ideally it is better for both the end user and the server to minimize connections and get a higher overall throughput. The problem is that every file requested carries the overhead of sending a request with its headers; when requests are made in parallel, that sending delay becomes negligible because both directions of the connection are kept busy. The other issue parallelism addresses is that it gives you a larger slice of the bandwidth on systems set up to divide network resources evenly per connection, so more connections means a greater overall share for the application in question.

Since you cannot solve all of these issues by other means, a smart webmaster can still divide the resources into multiple packages, adding the highest-priority items to each package first (a rough sketch follows below). That would also address some of the 'waste' you mentioned when a package is modified for one tiny image. To be honest, most people will figure out how to set up these 'packages' and won't be merging everything under the sun, especially if they anticipate changes.

> As for your idea of having the files download in parallel somehow... the whole
> point of having a single file is to prevent multiple files from being
> delivered. I don't think you can have it both ways.

That was an alternative suggestion. The main issue here is not multiple file downloads; it is the overhead of making the requests and the amount of data sent for each file requested from the server. That is not much per request, but on a home connection that is already in use it can add up to enough to delay the loading of all resources, even when they are fetched in parallel.

> Now, if you're suggesting that some sort of "manifest" file could be delivered
> to the browser to tell it what all the files in the ZIP are, sure... but how
> will that help at all, if the browser still has to wait for each file to show
> up as part of the single ZIP file stream?
>
> What we'd REALLY need for an idea like this to not hamper performance is for
> the browser (and server) to do some sort of parallel bit downloading of a
> single file, similar to bit-torrent kind of thing, where a single giant ZIP
> file could be delivered in simultaneous/parallel chunks, bit-by-bit. If you
> wanna argue for THAT, sure, go ahead. That'd be awesome. But it's a LONG way
> from being possible, as all web servers and all web browsers would have to
> support that. If either a webserver or a browser didn't, and the HTML served up
> suggested that single manifest.ZIP file, then this view of the site would be
> excruciatingly slow because all resources would default to the serial loading,
> worse than like IE3.0 kinda days.

I would not argue for that: it would be more complex for browsers to implement, with more problems to iron out, and a heavily used website would quickly become bogged down as the number of connections increases. It is more cost-efficient to minimize the number of connections.
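As a rough sketch of the multiple-packages idea mentioned above (nothing here is part of the proposal itself; the file names and the two-package split are invented for illustration), a webmaster might build one site-wide package and one page-specific package, writing the highest-priority items first:

```python
# Hypothetical sketch only: the proposal defines no tooling, so the file
# names and the two-package split below are invented for illustration.
import zipfile

# Resources shared by every page go into one long-lived package;
# page-specific resources go into a smaller one, so changing a single
# article image does not invalidate the big archive.
SITE_WIDE = ["js/library.js", "js/site.js", "css/global.css", "img/layout.png"]
ARTICLE_PAGE = ["css/article.css", "img/article-header.png"]

def build_package(archive_name, paths):
    """Write files into a ZIP in the given order, highest priority first,
    so a browser streaming the archive sees the critical items early."""
    with zipfile.ZipFile(archive_name, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            zf.write(path)

build_package("site-wide.zip", SITE_WIDE)
build_package("article.zip", ARTICLE_PAGE)
```

With a split like this, editing one article image only invalidates the small page-specific archive, not the site-wide one.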
> Moreover, the separate cacheability is a concern. If I have a big manifest.ZIP
> file with all my resources in it, and I change one tiny file in that manifest,
> then doesn't the entire ZIP file change (it's file signature/size/timestamp
> certainly does). So, the browser has to re-download the entire ZIP file. Lots
> and lots of wasted download of stuff that didn't change.

Well, you can always include version numbers or hashes inside the ZIP file and have the browser download only the difference between its copy and the server's; perhaps the technology behind BitTorrent could be extended to transfer only the diff of the two ZIP files. Or just have faith that a webmaster will be competent enough not to build one big manifest with everything under the sun in it. Ideally, the packages would be split so that resources used across the whole site (the main JavaScript libraries, layout-specific images, and global CSS files) go into one package; then, depending on how many resources an individual page needs, they would either be loaded the way they are now or bundled into another package or two.

The other ideas are not about removing parallel downloading but about optimizing it further, by allowing the browser to request multiple resources serially over one connection and to do the same across several parallel connections. The point is mainly to reduce the amount of data spent on headers by sharing them between grouped requests. Before you point out that headers can change from request to request: the markup could state which elements may be grouped this way and which might change cookies, or might change because the underlying data changes during a load. Perhaps an even simpler solution would be a way to mark certain elements so that only minimal headers are sent when requesting them (no referrer, cookies, user agent, and so on).

> All in all, I think this idea is flawed from the start. I think it has no
> chance of actually working to improve performance in the real world. There's
> just too many fundamental paradigm problems of packaging files up together that
> loses the huge performance ability of files to be separately loaded in parallel
> and separately cacheable. Any paradigm where you lose those two things is just
> not going to work.

The main problem I want to solve here is the amount of data sent with each request and the round-trip time required to make one. Since most internet connections have an upstream speed of roughly one tenth of the download speed, the typical 300-1000 bytes needed to make a request translate, under ideal conditions, to about 3 kB-10 kB of data that could have been downloaded instead. Even under less ideal conditions you can usually download far more than you can send in the same time period, so any saving in the amount sent means a big boost in the amount you can receive. This should be obvious, given that most people already merge JavaScript and CSS files and create sprite images to reduce the overall number of requests.

> NOW, if we're merely talking about saving the TCP overhead of establishing a
> new connection for each file (but that still files would be requested
> individually, and in parallel), then that IS something valuable. And it already
> exists and is in wide-spread use. It's called "Keep-Alive".

I am not sure whether you are being sarcastic, but I will give you the benefit of the doubt and assume you are being genuine. My main concern is not connection overhead; it is the overhead involved in making each request to the server.
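To make the upstream/downstream arithmetic above concrete, here is a back-of-the-envelope sketch; the 1/10 ratio and the 300-1000 byte request size are simply the figures quoted in this comment, not measurements:

```python
# Back-of-the-envelope version of the overhead argument above.
# The 1/10 upstream-to-downstream ratio and the 300-1000 byte request
# size are the figures quoted in the comment, not measured values.
UP_TO_DOWN_RATIO = 10          # downstream assumed ~10x faster than upstream
REQUEST_SIZES = (300, 1000)    # bytes for one HTTP request, headers included

for req_bytes in REQUEST_SIZES:
    # The time spent uploading the request could instead have carried
    # roughly UP_TO_DOWN_RATIO * req_bytes of downloaded data.
    forgone = req_bytes * UP_TO_DOWN_RATIO
    print(f"{req_bytes} B request ~ {forgone / 1000:g} kB of forgone download")
```

Running this prints the 3 kB and 10 kB figures used in the argument above.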
To be honest, you can point out flaws in every idea and system and can always attribute them to some theoretical incompetent person out there; you will always have those, but should we really stifle progress over it? A competent webmaster would take a package system and use it to greatly increase the efficiency of a site, not to mention the effort saved by no longer having to sprite and merge images, along with the extra CSS rules and bytes you avoid by packaging all the images together instead. Ideally a webmaster would take this idea and create two or three packages optimized for the overall loading speed of the site, which could easily double the speed of a page compared with the best and most extreme optimization techniques currently available.

--
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
Received on Saturday, 30 April 2011 06:12:13 UTC