- From: Andrea Marchesini <baku@mozilla.com>
- Date: Wed, 15 Aug 2012 04:24:29 -0700 (PDT)
- To: whatwg@whatwg.org
Thanks for your feedback. When I was implementing the ArchiveAPI, my idea was to have a generic archive API and not just a ZIP API. Of course, the current implementation supports only ZIP, but in the future we could add support for more formats.

> This interface is problematic. Since ZIP files don't have a standard
> encoding, filenames in ZIPs are often garbage. This API requires
> that filenames round-trip uniquely, or else files aren't accessible
> at all. For example, if you have two filenames in CP932, "日" and "本",
> but the encoding isn't determined correctly, you may end up with two
> files both with a filename of "??". Either you can't open either
> file, or you can only open one of them. This isn't theoretical; I
> hit ZIP files like this in the wild regularly.

I agree. I was thinking that the default encoding for filenames would be UTF-8. If a filename is not a valid UTF-8 string, we can use the caller-supplied encoding:

  var reader = new ArchiveReader(blob, "Windows-1252");

If this fails as well, that filename/file will be excluded from the results.

> It should be possible to get the CRC32 of files, which ZIP stores in
> the central directory. This both allows the user to perform checksum
> verification himself if wanted, and all the other variously useful
> things about being able to get a file's checksum without having to
> read the whole file.

Can we have a 'generic' archive API that supports CRC32?

Andrea
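To make the proposed fallback concrete, here is a rough sketch of the filename decoding order described above: strict UTF-8 first, then the caller-supplied encoding, and exclusion if both fail. This is only an illustration, not the actual ArchiveReader implementation. The names decodeEntryName and listDecodableNames are hypothetical, and the sketch assumes an environment where the standard TextDecoder with the "fatal" option is available and knows the fallback label.

```javascript
// Hypothetical helper: decode one raw filename (a Uint8Array).
// Returns the decoded string, or null if the entry should be excluded.
function decodeEntryName(bytes, fallbackEncoding) {
  try {
    // fatal: true makes decode() throw on malformed input
    // instead of silently inserting U+FFFD replacement characters.
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch (e) {
    // Not valid UTF-8; fall through to the caller-supplied encoding.
  }
  if (!fallbackEncoding) {
    return null;
  }
  try {
    return new TextDecoder(fallbackEncoding, { fatal: true }).decode(bytes);
  } catch (e) {
    // Unknown label or undecodable bytes: exclude this entry.
    return null;
  }
}

// Hypothetical helper: keep only the filenames that could be decoded.
function listDecodableNames(rawNames, fallbackEncoding) {
  return rawNames
    .map((bytes) => decodeEntryName(bytes, fallbackEncoding))
    .filter((name) => name !== null);
}
```

Note that some single-byte encodings, Windows-1252 among them, accept every possible byte, so with such a fallback the "excluded" case can only happen when no fallback is given. The sketch also does nothing about two different byte sequences decoding to the same name.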
Received on Wednesday, 15 August 2012 11:24:55 UTC