Re: [AudioTF] Agenda 2018-12-14

There seem to be some misconceptions around the use of the signature in
EPUB files that might need some clearing up. I hear that it is a pain to
add and that it is of no use to reading systems, particularly for the
streaming case. This is largely true, but also completely misses the point.
If the only type of zip based binary blob you deal with is EPUB, then the
signature is useless since the only sensible thing for you to do with it is
treat it as an EPUB. However, if you are a more general system that can
open multiple file types, you want to know which path to take and
potentially which module to route the data to. For instance, Google (*not*
Play Books) makes use of the signature in a number of other products to
figure out what to do with EPUB files. I expect there are other companies
that have products which do the same thing (and are *not* Reading Systems).
This is true of most file types - if your blob starts with the bytes 137 80
78 71 13 10 26 10, it is probably a PNG, and in the absence of a file
extension checking those bytes is a good way to set a mimetype for that
file. The same is true of our signature - if we get something that looks
like a zip, but has no file extension (or the wrong one), we can easily
check for a specific byte sequence at a known location. That is why the
mimetype file MUST be first and MUST NOT be encoded (that is, neither
compressed nor encrypted). Removing either of those restrictions makes the
signature entirely useless to everyone.
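
As a rough illustration of that check, here is a sketch in Python (not any
product's actual code). The function name sniff and the strings it returns
are made up for this example. It assumes the mimetype entry is the first
ZIP entry, is stored uncompressed, and has no extra field, so that its
contents start at byte 38 of the file.

```python
from typing import Optional

# The eight-byte PNG signature quoted above.
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])
ZIP_LOCAL_HEADER = b"PK\x03\x04"
EPUB_MIMETYPE = b"application/epub+zip"


def sniff(head: bytes) -> Optional[str]:
    """Guess a media type from the first bytes of a blob (hypothetical helper)."""
    if head.startswith(PNG_SIGNATURE):
        return "image/png"
    if head[0:4] == ZIP_LOCAL_HEADER:
        # A ZIP local file header is 30 bytes, followed by the file name.
        # With the name "mimetype" (8 bytes) and no extra field, the stored
        # contents occupy bytes 38..57.
        if head[30:38] == b"mimetype" and head[38:58] == EPUB_MIMETYPE:
            return "application/epub+zip"
        return "application/zip"
    return None
```

Reading the first 58 bytes of a file is enough for this check, which is
only possible because the signature sits at a fixed offset.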

So:
1. The EPUB signature file is useless for streaming EPUBs.
2. The EPUB signature file is useless for EPUB Reading Systems that have
another way to identify the file type (e.g. because EPUB is the only type
of ZIP archive supported).
3. The EPUB signature file is useless for people creating content, since
they obviously know what they just created.
4. The EPUB signature file is USEFUL for general systems that want to
handle an array of files, either internally or through use of external
modules (e.g. an add-on editor component, or an OS that wants to route the
file to the correct app).

Typically the stakeholders here represent items 1-3, but item 4 still seems
like a useful case and is one that is widely supported in other file types.
Items 1-3 apply equally well to PNG files, or other file types with a
signature.

As for difficulty in generating the file - perhaps. Most publishers seem to
have figured out how to do it. I am not sure how many person-hours they
waste daily adding signatures to EPUB files. I expect (hope) the answer is
0. It is fairly trivial to do from the command line in any *nix
environment. For dedicated EPUB creation tools - well, again, you do this
once and it then just works. I expect we have spent more time discussing
the issue than engineers have spent adding the file.
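
To give a sense of how little work is involved, here is a minimal sketch in
Python rather than a shell one-liner. The function name build_epub and the
source directory layout are hypothetical. It only takes care of writing the
signature first and uncompressed, not anything else a valid EPUB needs.

```python
import zipfile
from pathlib import Path


def build_epub(source_dir: str, output_path: str) -> None:
    """Zip source_dir into output_path with the mimetype entry first (sketch)."""
    src = Path(source_dir)
    with zipfile.ZipFile(output_path, "w") as zf:
        # Write the signature first and stored (uncompressed), so that its
        # bytes land at a known offset at the start of the archive.
        zf.writestr(
            zipfile.ZipInfo("mimetype"),
            b"application/epub+zip",
            compress_type=zipfile.ZIP_STORED,
        )
        # Add everything else afterwards, compressed as usual.
        for path in sorted(src.rglob("*")):
            name = path.relative_to(src).as_posix()
            if path.is_file() and name != "mimetype":
                zf.write(path, name, compress_type=zipfile.ZIP_DEFLATED)
```

The output file should live outside source_dir, otherwise the archive being
written would be picked up by the directory walk.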

My only real concern with the signature is that it would claim that a WP is
an EPUB, which seems like a good reason to change it.

On Thu, Dec 13, 2018 at 7:58 AM Dave Cramer <dauwhe@gmail.com> wrote:

> On Tue, Dec 11, 2018 at 6:40 AM Matt Garrish <matt.garrish@gmail.com>
> wrote:
>
>> The one “feature” of OCF that everyone seems to hate is that it requires
>> the mimetype file be the first in the ZIP container. That makes packaging
>> an EPUB more complicated than just zipping up all the files, since the
>> mimetype typically won’t get inserted first in general zipping scenarios.
>>
>> If we remove this restriction from OCF 3.2, then we possibly break the
>> loading of publications in reading systems that won’t process a publication
>> without first encountering the mimetype. I have no idea how many that is,
>> or if it’s common to fall back to finding a mimetype elsewhere in the zip
>> if it’s not first.
>
> Yesterday I made an EPUB without a mimetype, and just zipped it. Changed
> the file extension to EPUB. It would not load at all in iBooks, Adobe
> Digital Editions, Kobo, or AZARDI. It worked in Google Play Books. Kindle
> Previewer did process it, and the resulting Mobi worked in Kindle/Mac.
>
> Dave
>
>
>

Received on Thursday, 13 December 2018 16:38:59 UTC