[whatwg] Browser Bundled Javascript Repository from Joseph Pecoraro on 2009-06-15 (public-whatwg-archive@w3.org from June 2009)

From: Joseph Pecoraro <joepeck02@gmail.com>
Date: Mon, 15 Jun 2009 13:55:31 -0400
Message-ID: <7697745F-4290-4DF6-819D-A7D93C5DC972@gmail.com>

Hey Guys,

This is my first time on the list. I searched the archives but didn't
see anything like this, so I apologize if I missed any earlier
discussion of it.

A while back I came across this two-paragraph blog post titled
"Browsers Should Bundle JS Libraries":
http://fukamachi.org/wp/2009/03/30/browsers-should-bundle-js-libraries/

The premise is basically that browsers download the same JavaScript
frameworks from different domains over and over, every day. In the
author's own words:
"All popular, stable Javascript libraries, all open source. All
downloaded tens of millions of times a day, identical code each time."

Below is a summary and expansion of my comments/ideas from the
discussion on the above blog article.

A typical solution to the problem, and one that works right now in
browsers, is that if you require a JavaScript library on your website
you can point to a "publicly available" copy of that library. If
enough sites use the same public URI, the browser will usually already
have that URI in its cache and can reuse it instead of downloading the
library again. This is the idea behind Google's Hosted Libraries:
http://code.google.com/apis/ajaxlibs/

There are some arguments against using Google's Hosted Libraries:
http://www.derekallard.com/blog/post/jquery-hosted-on-google-and-some-implications-for-developers/

However, I think the author makes a good point. Bundling the JS
libraries in the browser seems like it would require very little
space, they could even be stored in a more efficient representation
(compiled bytecode, for example), and it would avoid an extra HTTP
request. The problem then becomes: how does a browser know that
example.com's jquery.js is the same as other.com's jquery.js? The
developer should opt in by telling the browser it wants to use a
certain JS library version that the browser may already know about.

The way I thought about it was by adding an attribute to the <script>
tag. In my comments, I used the "rel" attribute because of
developers' familiarity with it in other tags, but it could (and
probably should) be an entirely new attribute. The value inside of
this attribute would need to be a unique identifier for a possible
script available in the browser's repository. The "src" attribute
should still point to a hosted version of the script in case this
attribute is unsupported (ignored) or the script is not found in the
repository (not-bundled).

For Example:

<script rel="A56F2CED6..." src="..."></script>

<script rel="jquery-1.2.3" src="..."></script>

Here the "rel" attribute's value is a standard identifier for a
particular version of the jQuery library. The browser can check its
repository for that identifier. If it is found, no request is needed
and the browser loads its local copy. If it is not found, the browser
proceeds as normal and uses the "src" attribute to download the script.
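
The lookup described above can be sketched roughly as follows. This is
only an illustration in Python, not how any browser works: the
BUNDLED_SCRIPTS table, the resolve_script function, and the fetch
callback are all hypothetical names made up for this example.

```python
from typing import Callable, Dict, Optional

# Hypothetical in-browser repository: identifier -> bundled script source.
BUNDLED_SCRIPTS: Dict[str, str] = {
    "jquery-1.2.3": "/* bundled jQuery 1.2.3 source would live here */",
}


def resolve_script(identifier: Optional[str], src: str,
                   fetch: Callable[[str], str]) -> str:
    """Return the script source, preferring a bundled copy over the network.

    identifier is the value of the proposed attribute (None if absent),
    src is the normal fallback URL, and fetch stands in for the HTTP
    download the browser would otherwise perform.
    """
    if identifier is not None and identifier in BUNDLED_SCRIPTS:
        # Known identifier: no request is made at all.
        return BUNDLED_SCRIPTS[identifier]
    # Attribute missing or identifier unknown: behave exactly as today.
    return fetch(src)
```

A browser that ignores the attribute is equivalent to always taking the
last line, which is why the "src" fallback keeps older browsers working.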

----

Pros:

- Future-Proof: Adding a new attribute, or using a currently ignored
attribute, on the script tag would make this a safe addition that
works fine in older browsers (backwards compatible) and takes effect
immediately in browsers that support it.
- Developer Opt-In: Developers that choose not to use this feature
could just ignore it.
- Pre-Compiled: By bundling known JS Libraries with the browser, the
browser could store a more efficient representation of the file. For
instance pre-compiled into Bytecode or something else browser specific.
- Fewer HTTP Requests / Cache Checks: If a library is in the repository,
no request is needed and no cache checks need to be performed. For
the 100 sites you visit that all serve an equivalent jquery.js, the
browser would now send 0 requests for it. I think this would be
enticing to mobile browsers, which would benefit from this space vs.
time tradeoff.
- No 3rd Party is Gathering Statistics: One of the arguments against
using Google's Hosted Libraries is that you send them some data if you
are indeed using their scripts and a client downloads from them
(Referrer, etc.). Here there is no 3rd party; it's just between the
client browser and the domain.
- Standardizing Identifier For Libraries: The exact form of a common
identifier for libraries would be open for discussion. The best idea
I've had would be to use the SHA-1 hash of the desired release of a
JavaScript library. This would ensure a common identifier for the
same source file across browsers that support the feature. This would
be useful for developers as well. A debug tool can indicate to a
developer that the script they are using is available in the Browser
Repository with a certain identifier.
- Repository Can Grow Dynamically: Assuming this is a desirable
feature that shows some promise, the browser repository can grow
dynamically. Browsers can count the number of times they have seen
equivalent source files (SHA1 hash values), or seen identifiers they
didn't have in their repository and can grow (or shrink) accordingly.
Likewise official sources can distribute new script/identifiers like
browsers currently distribute lists of unsafe websites. This may not
even need to be the browser's responsibility. In the discussion on the
original article I envisioned a Firefox extension, with access to the
repository via an API, that would handle such dynamic updates.
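
To make the last two points more concrete, here is a rough sketch of
hash-based identifiers and of a repository that grows as the same
source is seen repeatedly. The script_identifier function, the
SightingTracker class, and the threshold parameter are hypothetical
names for illustration only. SHA-1 is used because it is what I
suggested above; any strong hash would fit the same shape.

```python
import hashlib
from collections import Counter
from typing import Dict


def script_identifier(source: bytes) -> str:
    """Identifier for a script: the SHA-1 of its exact bytes, as hex."""
    return hashlib.sha1(source).hexdigest().upper()


class SightingTracker:
    """Counts how often identical script bytes are seen and adds them to
    the repository once they reach a chosen threshold (an assumption)."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.sightings: Counter = Counter()
        self.repository: Dict[str, bytes] = {}

    def observe(self, source: bytes) -> str:
        ident = script_identifier(source)
        if ident not in self.repository:
            self.sightings[ident] += 1
            if self.sightings[ident] >= self.threshold:
                # Seen often enough: keep a local copy for future loads.
                self.repository[ident] = source
        return ident
```

Note that the hash covers the exact bytes, so two copies of the same
release that differ only in whitespace or a comment would get different
identifiers. That is the price of not having to trust file names.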

Cons:

- May Not Grow Fast Enough: If JS libraries release new versions
faster than the repository is updated, the repository won't get used
enough.
- May Not Scale: Are there too many JS libraries, versions, etc.,
making this unrealistic? Would storage become too large?

Verifying any Realistic Improvements:

- Implement a Repository.
- Fill the repository with the popular JS Libraries used in the top
100 websites (subjective) or web applications. Alter the static HTML
to contain the standard identifiers on the script tags to take
advantage of the repository.
- Run a benchmark of cold and hot (cached) loads of these 100 pages
with and without the repository.
- Compare times, memory, requests, etc.
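
As a starting point for the comparison, the request-count part of such
a benchmark could be modelled like this. It is a toy model only: it
ignores timing, memory, and cache validation, and the count_requests
function, the ScriptTag shape, and the sample pages are all made up
for illustration, not measurements.

```python
from typing import Iterable, List, Optional, Set, Tuple

# One script tag: (identifier from the proposed attribute or None, src URL).
ScriptTag = Tuple[Optional[str], str]


def count_requests(pages: Iterable[List[ScriptTag]],
                   repository: Set[str]) -> int:
    """Count script downloads over a sequence of page loads.

    Starts with an empty HTTP cache (cold) and caches by URL as it goes,
    so repeated URLs model hot loads.
    """
    cached_urls: Set[str] = set()
    requests = 0
    for tags in pages:
        for identifier, src in tags:
            if identifier in repository:
                continue  # served from the bundled repository
            if src in cached_urls:
                continue  # ordinary cache hit on the same URL
            requests += 1
            cached_urls.add(src)
    return requests


# Hypothetical sample: two sites host their own copy of the same library.
sample_pages = [
    [("jquery-1.2.3", "http://example.com/js/jquery.js")],
    [("jquery-1.2.3", "http://other.com/lib/jquery.js")],
    [(None, "http://other.com/app.js")],
]
without_repo = count_requests(sample_pages, set())
with_repo = count_requests(sample_pages, {"jquery-1.2.3"})
```

Running the same pages with an empty set and with a populated set gives
the two numbers to compare. Real timings would of course have to come
from an actual browser implementation.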

-----

I hope this is relevant to HTML5 and WHATWG. Unfortunately, I don't
have the experience or knowledge to know if such a repository would
provide browsers with any noticeable performance improvements. My
hope is that it would. Maybe someone on the list can offer their
opinion on whether or not they think this would even be worth
implementing.

Thank you for taking the time to read this! I look forward to hearing
feedback.
- Joe

Received on Monday, 15 June 2009 10:55:31 UTC