Re: [AudioTF] Agenda 2018-12-14

Yes, that seems correct. There is certainly no reason a file format must
have a signature (though, from your link, it is clear many do). But the two
arguments I have seen against it are that it is hard to add and that it is
useless. I disagree with both of those points.
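
To make the "not useless" half concrete, here is a rough Python sketch of
the kind of check a general system might do on a blob with no (or the wrong)
file extension. It is only an illustration, not how any particular product
works: the function name sniff_type and the fallback types are made up for
the example, and the offsets assume an ordinary ZIP local file header (30
fixed bytes, then the file name) with no extra field, so that the mimetype
content sits at a known location.

```python
# Byte signatures mentioned in this thread.
ZIP_LOCAL_HEADER = b"PK\x03\x04"  # 50 4B 03 04
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])

# Assumed layout of an EPUB: the first entry is named "mimetype", is stored
# without compression, and has no extra field in its local header.
NAME_OFFSET = 30  # size of the fixed part of a ZIP local file header
EPUB_NAME = b"mimetype"
EPUB_MIMETYPE = b"application/epub+zip"


def sniff_type(blob: bytes) -> str:
    """Guess a media type from the leading bytes of a blob (hypothetical helper)."""
    if blob.startswith(PNG_SIGNATURE):
        return "image/png"
    if blob.startswith(ZIP_LOCAL_HEADER):
        name_end = NAME_OFFSET + len(EPUB_NAME)
        name = blob[NAME_OFFSET:name_end]
        content = blob[name_end:name_end + len(EPUB_MIMETYPE)]
        if name == EPUB_NAME and content == EPUB_MIMETYPE:
            return "application/epub+zip"
        # Some other ZIP-based format, or a plain ZIP archive.
        return "application/zip"
    return "application/octet-stream"
```

Under those assumptions the first 58 bytes are enough to decide, and nothing
has to be unzipped or decoded, which is the whole point of requiring the
file to be first and not encoded.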

On Thu, Dec 13, 2018 at 10:09 AM Laurent Le Meur <laurent.lemeur@edrlab.org>
wrote:

> So Brady, your take is that the signature file (aka mimetype file) is in
> fact a set of magic numbers <https://asecuritysite.com/forensics/magic>,
> which follows the zip signature (I found 50 4B 03 04 on Wikipedia).
> Ok, that's a logical use of it, thanks for the info.
>
> But as you say, it doesn't mean we MUST keep it in a new OCF version: it
> means that if we remove it for packaged WPs, it may be more difficult for
> general systems to filter these out of the mass of zipped files they have to
> process.
>
> Laurent
>
>
> On 13 Dec 2018, at 17:38, Brady Duga <duga@google.com> wrote:
>
> There seem to be some misconceptions around the use of the signature in
> EPUB files that might need some clearing up. I hear that the signature is a
> pain to add and is of no use to reading systems, particularly for the
> streaming case. This is largely true, but also completely misses the point.
> If the only type of zip based binary blob you deal with is EPUB, then the
> signature is useless since the only sensible thing for you to do with it is
> treat it as an EPUB. However, if you are a more general system that can
> open multiple file types you want to know which path to take and
> potentially which module to route the data to. For instance, Google (*not*
> Play Books) makes use of the signature in a number of other products to
> figure out what to do with EPUB files. I expect there are other companies
> that have products which do the same thing (and are *not* Reading Systems).
> This is true of most file types - if your blob starts with 137 80 78 71 13
> 10 26 10 it is probably a PNG, and in the absence of a file extension that
> is a good way to set a mimetype for it. The same is true of our signature
> - if we get something that looks like a zip, but has no file extension (or
> the wrong one) we can easily check for a specific byte sequence at a known
> location. That is why the file MUST be first and MUST NOT be encoded.
> Removing either of those restrictions makes the file entirely useless to
> everyone.
>
> So:
> 1. The EPUB signature file is useless for streaming epubs.
> 2. The EPUB signature file is useless for EPUB Reading Systems that have
> another way to identify the file type (e.g. it is the only type of ZIP archive
> supported).
> 3. The EPUB signature file is useless for people creating content, since
> they obviously know what they just created.
> 4. The EPUB signature file is USEFUL for general systems that want to
> handle an array of files, either internally or through use of external
> modules (e.g. an add-on editor component, or an OS that wants to route the
> file to the correct app).
>
> Typically the stakeholders here represent items 1-3, but item 4 still
> seems like a useful case and is one that is widely supported in other file
> types. Items 1-3 apply equally well to PNG files, or other file types with
> a signature.
>
> As for difficulty in generating the file - perhaps. Most publishers seem
> to have figured out how to do it. I am not sure how many person hours they
> waste daily adding signatures to epub files. I expect (hope) the answer is
> 0. It is fairly trivial to do from the command line in any *nix
> environment. For dedicated epub creation tools - well, again, you do this
> once and it then just works. I expect we have spent more time discussing
> the issue than engineers have spent adding the file.
>
> My only real concern with the signature is that it claims a WP is an EPUB,
> which seems like a good reason to change it.
>
> On Thu, Dec 13, 2018 at 7:58 AM Dave Cramer <dauwhe@gmail.com> wrote:
>
>> On Tue, Dec 11, 2018 at 6:40 AM Matt Garrish <matt.garrish@gmail.com>
>> wrote:
>>
>>> The one “feature” of OCF that everyone seems to hate is that it requires
>>> the mimetype file be the first in the ZIP container. That makes packaging
>>> an EPUB more complicated than just zipping up all the files, since the
>>> mimetype typically won’t get inserted first in general zipping scenarios.
>>>
>>>
>>>
>>> If we remove this restriction from OCF 3.2, then we possibly break the
>>> loading of publications in reading systems that won’t process a publication
>>> without first encountering the mimetype. I have no idea how many that is,
>>> or if it’s common to fall back to finding a mimetype elsewhere in the zip
>>> if it’s not first.
>>>
>>>
>>>
>>
>> Yesterday I made an EPUB without a mimetype, and just zipped it. Changed
>> the file extension to EPUB. It would not load at all in iBooks, Adobe
>> Digital Editions, Kobo, or AZARDI. It worked in Google Play Books. Kindle
>> Previewer did process it, and the resulting Mobi worked in Kindle/Mac.
>>
>> Dave
>>
>>
>>
>
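
As for the "hard to add" half of the quoted discussion, a minimal sketch of
packaging with the mimetype file first, using Python's standard zipfile
module. The function name package_epub and the directory layout are
assumptions for illustration; it only handles the ordering and the
no-compression requirement and does not validate anything else OCF asks for.

```python
import zipfile
from pathlib import Path

EPUB_MIMETYPE = "application/epub+zip"


def package_epub(source_dir: str, output_path: str) -> None:
    """Zip source_dir into output_path with "mimetype" as the first entry.

    Hypothetical helper: output_path is assumed to live outside source_dir.
    """
    root = Path(source_dir)
    with zipfile.ZipFile(output_path, "w") as archive:
        # Write the signature first and uncompressed (stored), so its bytes
        # land at a fixed offset right after the first local file header.
        archive.writestr("mimetype", EPUB_MIMETYPE,
                         compress_type=zipfile.ZIP_STORED)
        # Then add everything else, in any order, with normal compression.
        for path in sorted(root.rglob("*")):
            if not path.is_file():
                continue
            arcname = path.relative_to(root).as_posix()
            if arcname == "mimetype":
                continue  # already written above
            archive.write(path, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

A dedicated tool would do the equivalent of this once and never think about
it again.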

Received on Thursday, 13 December 2018 18:21:59 UTC