Re: [whatwg] Zip archives as first-class citizens

From: Anne van Kesteren <annevk@annevk.nl>
Date: Thu, 29 Aug 2013 14:02:48 +0100
Message-ID: <CADnb78ieWieX_eZ44QB09tMW8_FP4YNFxCZNs4hgUrkd8drBGQ@mail.gmail.com>
To: Jake Archibald <jaffathecake@gmail.com>
Cc: WHATWG <whatwg@whatwg.org>

On Thu, Aug 29, 2013 at 1:19 PM, Jake Archibald <jaffathecake@gmail.com> wrote:
> Causing a network error in existing browsers is a shame. It'd be great if
> older browsers requested a url which included the zip location & the file
> within, so the server could unpack the zip and deliver the right file.
> Whereas modern browsers would request the zip & handle the unpacking
> clientside. Although I guess that would break a load of stuff.

Picking something that could occur in paths seems problematic.


> Is this syntax compatible with datauris? As in, will I be able to build a
> datauri of a zip containing a 2x2 png and access the png directly? Feels
> like a nice feature-detect.

Yes, if we decide on zip-path over sub-scheme we should support it for
all URLs, much like the fragment is.


> If I navigate to something within a zip file, how is it rendered? I'm
> assuming content-type isn't stored within a zip file (am I wrong?), so how
> can the browser differentiate a text file from an html file from a pdf etc
> etc.

Using the file extension and no sniffing of any sort, falling back to
application/octet-stream. That determines the Content-Type. Whether
the loading context (e.g. <img>) ignores that is up to the loading
context. https://etherpad.mozilla.org/zipurls outlines this.
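
The extension-only rule above could be sketched roughly as follows. This
is an illustration, not the algorithm from the etherpad: the function name,
the small extension table, and the case-insensitive match are all
assumptions made for the example.

```python
import posixpath

# Hypothetical extension table; the real mapping would be defined by the
# specification, not by this sketch.
EXTENSION_TYPES = {
    ".html": "text/html",
    ".css": "text/css",
    ".js": "application/javascript",
    ".png": "image/png",
    ".gif": "image/gif",
    ".pdf": "application/pdf",
    ".txt": "text/plain",
}


def content_type_for_entry(entry_name: str) -> str:
    """Pick a Content-Type from the entry's file extension only.

    The entry's bytes are never inspected (no sniffing). Unknown or
    missing extensions fall back to application/octet-stream.
    """
    _, ext = posixpath.splitext(entry_name)
    # Assumption: extensions are matched case-insensitively.
    return EXTENSION_TYPES.get(ext.lower(), "application/octet-stream")
```

Whether a loading context such as <img> then honours or ignores the
returned type is a separate decision, as noted above.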


> Will CORS headers on the zip apply to all its contents? I guess they would.

It would make sense to forward them, agreed. We'd have to add that to
the Fetch layer when constructing a response.
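
As a rough sketch of what forwarding could look like when a response for
an entry is constructed: the function name and the plain-dict header
representation are assumptions for illustration, and nothing here is
defined by Fetch today.

```python
from typing import Dict


def entry_response_headers(
    archive_headers: Dict[str, str], content_type: str
) -> Dict[str, str]:
    """Build headers for a zip entry's response.

    The Content-Type comes from the entry itself; every Access-Control-*
    header on the archive's response is copied over unchanged. Other
    archive headers are not forwarded in this sketch.
    """
    headers = {"Content-Type": content_type}
    for name, value in archive_headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower().startswith("access-control-"):
            headers[name] = value
    return headers
```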


> Right now if I request a page it'll progressively render as assets are
> downloaded, the html will render before it's fully downloaded, as will
> images etc. As Eric Uhrhane points out, this can't happen with zip. If my
> CSS is in a zip with JS and images, page rendering is blocked on the whole
> zip rather than just the CSS.
>
> Also, if I change any file within my zip, the whole zip gets a cache-miss
> next time the user visits, rather than just the files that changed.
>
> As we get new protocols which reduce the request overhead, it'll be faster
> to transfer lots of smaller files that can cache and execute independently.
> A zip file feels a step backwards.

A great server setup will be better, agreed. I don't think we can
assume everyone will have one. Also, to some extent we'll have to
figure out how people will want to deploy this new primitive and then
see if there are ways to improve that. I expect this would be used by
less-capable ad-servers, assets for games, toying with EPUB and OOXML,
etc.


> I understand the ES modules use-case, but I think that's better solved with
> an ES module-specific use of the url fragment, eg
> combined-file.js#module-identifier.

I don't really see how that would work. ES modules are standalone
files and don't have dedicated syntax other than import/export. For ES
modules it would maybe work to only support extracting a file out of
a zip archive, but even in ES modules you'd want to reference other
modules that might be in the zip archive. And you'd definitely want to
do that for CSS.


> Why don't we want urls to escape the topmost zip archive?

It seems more likely to me that escaping would be a mistake than
actually intended. But it depends on what use case you envision as Boris
pointed out earlier. You can always point elsewhere by using a <base>
or absolute URL though.


> Nested zip support sounds fair, but why is the 2nd zip boundary ! rather
> than %!?

My idea was to indicate in the URL what the topmost zip archive is. It
doesn't matter much though, it could be the same.


> Eg:
> <img src="zipfile%!cat.gif">
>
> Would the above work if the current url is
> http://whatever.com/zip%!index.html, making the image url
> http://whatever.com/zip%!zipfile%!cat.gif

Sure.
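
To make the resolution in that example concrete, here is a minimal sketch,
assuming %! is the boundary and that relative references are resolved
against the entry path inside the topmost zip without ever climbing above
it. The function name is hypothetical, and absolute URLs, queries, and
fragments are deliberately not handled.

```python
import posixpath

ZIP_BOUNDARY = "%!"  # separator used in this thread's examples


def resolve_inside_zip(base_url: str, reference: str) -> str:
    """Resolve a relative reference against a base URL inside a zip."""
    # Split at the first boundary: everything before it names the
    # topmost archive, everything after it is the path inside it.
    archive, boundary, entry = base_url.partition(ZIP_BOUNDARY)
    if not boundary:
        raise ValueError("base URL does not point inside a zip archive")
    # Resolve against the directory of the current entry, rooted at "/"
    # so that normpath() drops any ".." that would climb above the root
    # of the topmost zip.
    base_dir = posixpath.dirname(entry)
    resolved = posixpath.normpath(posixpath.join("/", base_dir, reference))
    return archive + ZIP_BOUNDARY + resolved.lstrip("/")
```

With this sketch, resolving zipfile%!cat.gif against
http://whatever.com/zip%!index.html gives
http://whatever.com/zip%!zipfile%!cat.gif, and a reference such as
../../img/cat.gif from css/main.css stays inside the archive as
img/cat.gif.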


-- 
http://annevankesteren.nl/
Received on Thursday, 29 August 2013 13:03:14 UTC
