Re: ZIP archive API?

On Fri, May 3, 2013 at 12:12 PM, Paul Bakaus <pbakaus@zynga.com> wrote:
>
>
> From: Florian Bösch <pyalot@gmail.com>
> Date: Fri, 3 May 2013 21:05:17 +0200
> To: Jonas Sicking <jonas@sicking.cc>
> Cc: Paul Bakaus <pbakaus@zynga.com>, Anne van Kesteren <annevk@annevk.nl>,
> Webapps WG <public-webapps@w3.org>, Charles McCathie Nevile
> <chaals@yandex-team.ru>, Andrea Marchesini <amarchesini@mozilla.com>
>
> Subject: Re: ZIP archive API?
>
> It can be implemented by a JS library, but the three reasons to let the
> browser provide it are convenience, speed and integration.
>
> Convenience is the first reason: browsers by and large already
> have complete bindings to compression algorithms and archive formats, so
> letting the browser simply expose the software it already ships makes more
> sense than requiring every JS user to supply their own version.
>
> Speed may not matter too much on some platforms, but it matters a great deal
> on underpowered devices such as mobiles.

Show me some numbers to back this up and you'll have me convinced :)

Remember that on underpowered devices native code is proportionally slower too.

> Integration is where the support for archives goes beyond being an API,
> where URLs (to link.href, script.src, img.src, iframe.src, audio.src,
> video.src, css url(""), etc.) could point into an archive. This cannot be
> done in JS.
>
>
> I was going to say exactly that. I want to be able to have a virtual URL
> that I can point to. In my CSS, I want to do something like
> "archive://assets/foo.png" after I loaded and decompressed the ZIP file in
> JS.

How does the "assets" part in the example above work? What does it
mean? Is there some registry here or something?

> Jonas, I'm intrigued – do you see a way this could be done in JS? If so,
> maybe we should build a sample. I'm still thinking the performance won't be
> good enough, particularly on mobile devices, but let's find out.

You can actually do this in Gecko already. Any archive that you can
refer to through a URL, you can also reach into.

So if you have a .zip file in a Blob, and you generate a blob: URL
like "blob:123-abc" then you can read the "foo.html" file out of that
archive by using the URL "jar:blob:123-abc!/foo.html". So far this
doesn't work with ArrayBuffers since there is no way to have a URL
that refers to an ArrayBuffer.

You can even load something from inside a zip file from a server by doing
<img src="jar:http://example.com/foo/archive.zip!/image.jpg">
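The URL composition described above can be sketched as a small helper. This is only an illustration: `jarUrl` is a hypothetical name, not an existing browser API, and the resulting URLs only resolve in an engine that supports the jar: scheme as described in this message.

```javascript
// Hypothetical helper (not part of any browser API): composes a jar: URL
// from the URL of an archive and the path of a member inside that archive.
function jarUrl(archiveUrl, memberPath) {
  // Drop leading slashes so exactly one "/" follows the "!" separator.
  const path = String(memberPath).replace(/^\/+/, "");
  return "jar:" + archiveUrl + "!/" + path;
}

// The two examples from this message:
console.log(jarUrl("blob:123-abc", "foo.html"));
console.log(jarUrl("http://example.com/foo/archive.zip", "image.jpg"));

// In a page, assuming zipBlob is a Blob holding a .zip file and the engine
// supports jar: URLs, the archive URL would come from a blob: URL:
//   img.src = jarUrl(URL.createObjectURL(zipBlob), "image.jpg");
```

The helper only builds the string; whether the member actually loads depends entirely on the browser's jar: support.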

Something like that I definitely agree that we should standardize.

/ Jonas

> On Fri, May 3, 2013 at 8:04 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>>
>> The big question we kept running up against at Mozilla is "why couldn't
>> this simply be implemented as a JS library?"
>>
>> If performance is the argument we need to back that up with data.
>>
>> / Jonas
>>
>> On May 3, 2013 10:51 AM, "Paul Bakaus" <pbakaus@zynga.com> wrote:
>>>
>>> Hi Anne, Florian,
>>>
>>> I think the first baby step, or MVP, is the unpacking that Florian
>>> mentions below. I would definitely like to have the API available on both
>>> workers and normal context.
>>>
>>> Thanks,
>>> Paul
>>>
>>> From: Florian Bösch <pyalot@gmail.com>
>>> Date: Fri, 3 May 2013 14:52:36 +0200
>>> To: Anne van Kesteren <annevk@annevk.nl>
>>> Cc: Paul Bakaus <pbakaus@zynga.com>, Charles McCathie Nevile
>>> <chaals@yandex-team.ru>, public-webapps WG <public-webapps@w3.org>, Andrea
>>> Marchesini <amarchesini@mozilla.com>
>>> Subject: Re: ZIP archive API?
>>>
>>> I'm interested in a JS API that does the following:
>>>
>>> Unpacking:
>>> - Receive an archive from a Dataurl, Blob, URL object, File (as in
>>> filesystem API) or Arraybuffer
>>> - List its content and metadata
>>> - Unpack members to Dataurl, Blob, URL object, File or Arraybuffer
>>>
>>> Packing:
>>> - Create an archive
>>> - Put in members passing a Dataurl, Blob, URL object, File or Arraybuffer
>>> - Serialize archive to Dataurl, Blob, URL object, File or Arraybuffer
>>>
>>> To avoid the whole worker/proxy thing and to allow authors to selectively
>>> choose how they want to handle the data, I'd like to see synchronous and
>>> asynchronous versions of each. I'd make synchronicity an argument/flag or
>>> something to avoid API clutter like packSync, packAsync, writeSync,
>>> writeAsync, and rather like write(data, callback|boolean).
>>>
>>> - Python's zipfile API is ok, except the getinfo/setinfo stuff is a bit
>>> over the top: http://docs.python.org/3/library/zipfile.html
>>> - Python's tarfile API is less cluttered and easier to use:
>>> http://docs.python.org/3/library/tarfile.html
>>> - zip.js isn't really usable as it doesn't support the full range of
>>> types (Dataurl, Blob, URL object, File or Arraybuffer) and for asynchronous
>>> operation needs to rely on a worker, which is bothersome to set up:
>>> http://stuk.github.io/jszip/
>>>
>>> My own implementation of the tar format only targets array buffers and
>>> works synchronously, as in:
>>>
>>> var archive = new TarFile(arraybuffer);
>>> var memberArrayBuffer = archive.get('filename');
>>>
>>>
>>>
>>> On Fri, May 3, 2013 at 2:37 PM, Anne van Kesteren <annevk@annevk.nl>
>>> wrote:
>>>>
>>>> On Thu, May 2, 2013 at 1:15 AM, Paul Bakaus <pbakaus@zynga.com> wrote:
>>>> > Still waiting for it as well. I think it'd be very useful to transfer
>>>> > sets
>>>> > of assets etc.
>>>>
>>>> Do you have anything in particular you'd like to see happen first?
>>>> It's pretty clear we should expose more here, but as with all things
>>>> we should do it in baby steps.
>>>>
>>>>
>>>> --
>>>> http://annevankesteren.nl/
>>>
>>>
>

Received on Friday, 3 May 2013 22:19:48 UTC