Re: CBOR Tutorial from Leonard Rosenthol on 2018-01-31 (public-publ-wg@w3.org from January 2018)

From: Leonard Rosenthol <lrosenth@adobe.com>
Date: Wed, 31 Jan 2018 16:58:20 +0000
To: Laurent Le Meur <laurent.lemeur@edrlab.org>
CC: Baldur Bjarnason <baldur@rebus.foundation>, Ivan Herman <ivan@w3.org>, Romain <rdeltour@gmail.com>, Schindler Wolfgang Dr. <w.schindler@pons.de>, "Davis, Greg" <greg.davis@pearson.com>, Ric Wright <rkwright@geofx.com>, "W3C Publishing Working Group" <public-publ-wg@w3.org>
Message-ID: <AE7489DB-8066-4321-BE67-EB12CB8B8557@adobe.com>
CBOR is a great format for “over the wire” data exchange. It is not a good format for “off the web” exchange (IMO).

Leonard

From: Laurent Le Meur <laurent.lemeur@edrlab.org>
Date: Wednesday, January 31, 2018 at 1:06 PM
To: Leonard Rosenthol <lrosenth@adobe.com>
Cc: Baldur Bjarnason <baldur@rebus.foundation>, Ivan Herman <ivan@w3.org>, Romain <rdeltour@gmail.com>, "Schindler Wolfgang Dr." <w.schindler@pons.de>, "Davis, Greg" <greg.davis@pearson.com>, Ric Wright <rkwright@geofx.com>, W3C Publishing Working Group <public-publ-wg@w3.org>
Subject: Re: CBOR Tutorial

... That makes me wonder whether CBOR can be described as an interchange format in the usual B2B or B2C use cases. From your assertion, it appears to be a bundle format specialized for fetching multiple resources efficiently (in one hit) over an Internet (HTTP) connection.

Best regards,

Laurent Le Meur
EDRLab


On 31 Jan 2018, at 02:49, Leonard Rosenthol <lrosenth@adobe.com> wrote:

Because of the ubiquity of compressed/gzipped HTTP responses and how the package stores responses, many text entries in a package will be stored compressed as binaries and not text.
That's the key piece here for Web Packages vs. PWP. Web Packages are expected (in the vast majority of use cases) to be delivered over an HTTP connection, which is itself compressed. Also, there is no concern about storing them on devices or in quota-based storage. For PWP, however, we expect that delivery may take place by other means, and that the package will certainly be stored by the user somewhere with limited storage (a device, a cloud storage system, etc.).

Leonard


On 1/31/18, 1:42 AM, "Baldur Bjarnason" <baldur@rebus.foundation> wrote:

   There are a few informative discussions on this in the Web Packaging repository:

   * "Switch to binary format and more." https://github.com/WICG/webpackage/issues/38

   * "Inclusion of binary data into a text-based format" https://github.com/WICG/webpackage/issues/10



   Cited reasons (as far as I can tell):

   * The TAG proposal proved to be more complex to implement than anticipated. Formats like CBOR or DER have pre-existing implementations and are used in other standards, so browsers have to support them anyway.
   * A good portion of the packaged resources are going to be binaries, so a binary format would lead to considerable space savings over a text format.

   Because of the ubiquity of compressed/gzipped HTTP responses and how the package stores responses, many text entries in a package will be stored compressed as binaries and not text.
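
To make the point about compressed entries concrete, here is a minimal Python sketch, assuming a text resource is gzipped once and the compressed bytes are what gets stored. The function name `store_entry` and the header dictionary are hypothetical; they only mimic an HTTP response carrying `Content-Encoding: gzip` and are not taken from the Web Packaging draft.

```python
import gzip


def store_entry(body: bytes) -> tuple:
    """Return (headers, payload) for one stored response.

    Hypothetical helper: the payload is the gzip-compressed body, so a
    text resource ends up in the package as opaque binary data.
    """
    payload = gzip.compress(body)
    headers = {"content-encoding": "gzip"}
    return headers, payload


# Repetitive markup, as is typical for HTML, compresses well.
html = ("<p>Chapter text repeated for illustration.</p>\n" * 200).encode("utf-8")
headers, payload = store_entry(html)

print(len(html), "bytes as text,", len(payload), "bytes as stored")

# The original text is recovered only after decompression.
restored = gzip.decompress(payload)
```

Once a text entry is stored this way it is no longer human-readable inside the package, whatever the container format, which is why a text-based container gains little here.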


   There’s also a discussion of whether to switch away from CBOR to DER for more secure parsing and better error handling:

   * "Consider switching to DER-encoded ASN.1" https://github.com/WICG/webpackage/issues/47


   But based on that discussion it seems likely that they’ll stick to CBOR as that’s a simpler format.


   Also relevant:

   * “Explain why we're not using ZIP” https://github.com/WICG/webpackage/issues/45


   - best
   - Baldur Bjarnason
     baldur@rebus.foundation




On 30 Jan 2018, at 14:33, Ivan Herman <ivan@w3.org> wrote:

Romain,

That is true. But the question is: what is the advantage of using CBOR over simply transferring the original resource data (just as the original TAG document proposed)?

Ivan

---
Ivan Herman
Tel:+31 641044153
http://www.ivan-herman.net


(Written on mobile, sorry for brevity and misspellings...)



On 30 Jan 2018, at 20:04, Romain <rdeltour@gmail.com> wrote:


On 30 Jan 2018, at 19:11, Schindler Wolfgang Dr. <w.schindler@pons.de> wrote:


Am I right, then, that for a content document in HTML, CBOR only means a 1:1 translation of UTF-8 codes into a binary format that would have exactly the same file size? If this is true, I’m afraid I don’t see (yet?) the connection to Web Packaging or the rationale for exchanging a human-readable format for a binary format. Or am I perhaps missing some decisive benefits?

With CBOR and Jeffrey’s spec, you can *bundle* resources together and exchange them as one cohesive resource. Since a publication is a *collection* of multiple resources, we need a format to package them.
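
The size question and the bundling answer above can be made concrete with a small sketch. In CBOR (RFC 7049), a text string is a short length header followed by the same UTF-8 bytes, so an HTML document grows by at most nine bytes; the gain is that several resources can sit in one container. The Python below hand-encodes that. The helper names (`cbor_head`, `cbor_text`, `cbor_bytes`, `bundle`) and the path-to-body map layout are assumptions made for illustration; this is not the structure defined by the Web Packaging draft.

```python
import struct


def cbor_head(major: int, n: int) -> bytes:
    """Encode a CBOR initial byte plus its length argument (RFC 7049, section 2.1)."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 0x100:
        return bytes([(major << 5) | 24, n])
    if n < 0x10000:
        return bytes([(major << 5) | 25]) + struct.pack(">H", n)
    if n < 0x100000000:
        return bytes([(major << 5) | 26]) + struct.pack(">I", n)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", n)


def cbor_text(s: str) -> bytes:
    """Major type 3: length header, then the unchanged UTF-8 bytes."""
    data = s.encode("utf-8")
    return cbor_head(3, len(data)) + data


def cbor_bytes(b: bytes) -> bytes:
    """Major type 2: length header, then the raw bytes."""
    return cbor_head(2, len(b)) + b


def bundle(resources: dict) -> bytes:
    """Hypothetical layout: one CBOR map from path (text) to body (byte string)."""
    out = cbor_head(5, len(resources))
    for path, body in resources.items():
        out += cbor_text(path) + cbor_bytes(body)
    return out


html = "<p>" + "x" * 993 + "</p>"          # exactly 1000 bytes of UTF-8
encoded = cbor_text(html)                   # 3-byte header + 1000 bytes

package = bundle({
    "index.html": html.encode("utf-8"),
    "cover.png": b"\x89PNG\r\n\x1a\n",     # placeholder bytes, not a real image
})
```

So the answer to the size question is "almost the same size": the 1000-byte document becomes 1003 bytes. The benefit lies in `bundle`, where text and binary resources travel together as one resource.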

Romain.
Received on Wednesday, 31 January 2018 16:58:51 UTC