
Re: [whatwg] Zip archives as first-class citizens

From: Jake Archibald <jaffathecake@gmail.com>
Date: Thu, 29 Aug 2013 13:19:17 +0100
Message-ID: <CAJ5xic-npAJAhWV4sV7ut=5Vu3bTfuxfeXLtjtnvj9B6st8mAQ@mail.gmail.com>
To: Anne van Kesteren <annevk@annevk.nl>
Cc: WHATWG <whatwg@whatwg.org>

On 28 August 2013 14:32, Anne van Kesteren <annevk@annevk.nl> wrote:

> * Using a sub-scheme (zip) with a zip-path (after !):
> zip:http://www.example.org/zip!image.gif
> * Introducing a zip-path (after %!): http://www.example.org/zip%!image.gif
> * Using media fragments: http://www.example.org/zip#path=image.gif
>
> High-level drawbacks:
>
> * Sub-scheme: requires changing the URL syntax with both sub-scheme
> and zip-path.
> * Zip-path: requires changing the URL syntax.
> * Fragments: fail to work well for URLs relative to a zip archive.
>

I prefer the zip-path. It works with relative URLs both into and out of
the zip.

Causing a network error in existing browsers is a shame. It'd be great if
older browsers requested a URL that included the zip location and the file
within it, so the server could unpack the zip and deliver the right file,
while modern browsers requested the zip and handled the unpacking
client-side. Although I guess that would break a load of stuff.
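
To make that split concrete, here is a rough Python sketch of what a server
(or a client) would have to do with such a URL. It assumes the first "%!"
marks the boundary between the archive URL and the file inside it;
split_zip_url is a made-up name, not part of any proposal.

```python
from typing import Optional, Tuple

ZIP_SEP = "%!"  # assumed delimiter, taken from the zip-path proposal


def split_zip_url(url: str) -> Tuple[str, Optional[str]]:
    """Split a URL at the first '%!' into (archive_url, path_inside_zip).

    Returns (url, None) when the URL carries no zip-path at all.
    """
    archive, sep, inner = url.partition(ZIP_SEP)
    if not sep:
        return url, None
    return archive, inner


if __name__ == "__main__":
    print(split_zip_url("http://www.example.org/zip%!image.gif"))
```

A server doing the unpacking itself would fetch the first part from disk
and stream the named entry; a zip-aware browser would request only the
first part and look up the second locally.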

Is this syntax compatible with data URIs? As in, will I be able to build a
data URI of a zip containing a 2x2 PNG and access the PNG directly? Feels
like a nice feature-detect.

If I navigate to something within a zip file, how is it rendered? I'm
assuming content-type isn't stored within a zip file (am I wrong?), so how
can the browser differentiate a text file from an HTML file from a PDF,
etc.?
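
One possible answer is to guess from the file extension. The Python sketch
below only illustrates that fallback and is not what any browser does: it
builds a small in-memory zip and maps each entry name to a type with the
standard mimetypes module. The entry names and both helper functions are
invented for the example.

```python
import io
import mimetypes
import zipfile
from typing import Dict, Optional


def build_demo_zip() -> bytes:
    """Build a tiny in-memory zip with hypothetical entries."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("index.html", "<p>hello</p>")
        zf.writestr("notes.txt", "hello")
        zf.writestr("README", "no extension here")
    return buf.getvalue()


def guess_entry_types(zip_bytes: bytes) -> Dict[str, Optional[str]]:
    """Map each entry name to a guessed MIME type, or None if unknown."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        # Only the entry name is used for the guess; the sketch reads no
        # type information from the archive itself.
        return {name: mimetypes.guess_type(name)[0] for name in zf.namelist()}


if __name__ == "__main__":
    for name, ctype in guess_entry_types(build_demo_zip()).items():
        print(name, ctype)
```

Entries without a recognisable extension come back as None, which is
exactly the case where the browser would need sniffing or some other rule.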

Will CORS headers on the zip apply to all its contents? I guess they would.

I have some higher-level concerns; it feels like we're introducing an
anti-pattern:

Right now, if I request a page, it progressively renders as assets are
downloaded: the HTML renders before it's fully downloaded, as do images
etc. As Eric Uhrhane points out, this can't happen with zip. If my
CSS is in a zip with JS and images, page rendering is blocked on the whole
zip rather than just the CSS.

Also, if I change any file within my zip, the whole zip gets a cache-miss
next time the user visits, rather than just the files that changed.
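
The cache point can be shown with a small Python sketch. It assumes a cache
that treats a resource as changed whenever its bytes change, and uses
invented file names and contents: after editing one file, the archive as a
whole differs, while a per-file comparison flags only the edited entry.

```python
import hashlib
import io
import zipfile
from typing import Dict, List


def build_zip(files: Dict[str, bytes]) -> bytes:
    """Pack files into an in-memory zip with fixed timestamps."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in sorted(files):
            info = zipfile.ZipInfo(name, date_time=(2013, 8, 29, 0, 0, 0))
            zf.writestr(info, files[name])
    return buf.getvalue()


def digest(data: bytes) -> str:
    """Stand-in for whatever validator a cache would compare."""
    return hashlib.sha256(data).hexdigest()


def changed_entries(old: Dict[str, bytes], new: Dict[str, bytes]) -> List[str]:
    """Names whose contents differ between two versions of the site."""
    names = old.keys() | new.keys()
    return sorted(n for n in names if old.get(n) != new.get(n))


if __name__ == "__main__":
    v1 = {"style.css": b"body{color:red}", "app.js": b"go()", "logo.png": b"\x89PNG"}
    v2 = dict(v1, **{"style.css": b"body{color:blue}"})
    print("zip changed:", digest(build_zip(v1)) != digest(build_zip(v2)))
    print("files changed:", changed_entries(v1, v2))
```

With separate files only style.css would need re-fetching; with the zip,
the JS and the image ride along again too.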

As we get new protocols which reduce the request overhead, it'll be faster
to transfer lots of smaller files that can cache and execute independently.
A zip file feels like a step backwards.

I understand the ES modules use-case, but I think that's better solved with
an ES module-specific use of the URL fragment, e.g.
combined-file.js#module-identifier.


> As for nested zip archives. Andrea suggested we should support this,
> but that would require zip-path to be a sequence of paths. I think we
> never went to allow relative URLs to escape the top-most zip archive.
> But I suppose we could support in a way that
>
>   %!test.zip!test.html
>

Why don't we want URLs to escape the topmost zip archive?

Nested zip support sounds fair, but why is the second zip boundary ! rather
than %!?

E.g.:
<img src="zipfile%!cat.gif">

Would the above work if the current URL is
http://whatever.com/zip%!index.html, making the image URL
http://whatever.com/zip%!zipfile%!cat.gif?
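
To spell out the resolution I have in mind, here is a rough Python sketch.
It assumes a relative URL is resolved against the directory of the
innermost zip-path in the base URL and that "%!" is the boundary at every
nesting level, which is precisely what is being asked above.
resolve_in_zip is a hypothetical helper and only handles plain relative
paths (no "..", no leading slash, no query or fragment).

```python
import posixpath

ZIP_SEP = "%!"  # assumed to mark every zip boundary, nested ones included


def resolve_in_zip(base: str, relative: str) -> str:
    """Resolve a plain relative path against the innermost zip-path of base."""
    archive, sep, inner = base.rpartition(ZIP_SEP)
    if not sep:
        raise ValueError("base URL has no zip-path")
    # Swap the last segment of the zip-path for the relative reference,
    # as ordinary relative URL resolution does for normal paths.
    directory = posixpath.dirname(inner)
    return archive + ZIP_SEP + posixpath.join(directory, relative)


if __name__ == "__main__":
    print(resolve_in_zip("http://whatever.com/zip%!index.html",
                         "zipfile%!cat.gif"))
```

Under those assumptions the example prints the nested URL given above. A
URL parser with no knowledge of zip-paths would instead treat
"zip%!index.html" as a single path segment and replace all of it.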

Cheers,
Jake.
Received on Thursday, 29 August 2013 12:19:43 UTC
