Re: Comparison of compression algorithms from Magnus Feuer on 2020-09-01 (public-automotive@w3.org from September 2020)

From: Magnus Feuer <mfeuer1@jaguarlandrover.com>
Date: Tue, 1 Sep 2020 18:20:11 +0000
To: public-automotive <public-automotive@w3.org>
CC: Steven Martin <smarti24@jaguarlandrover.com>
Message-ID: <DB7PR04MB4697BC3055D65BB6E91832F7F92E0@DB7PR04MB4697.eurprd04.prod.outlook.com>
If payload size is a problem then we should definitely look at a binary setup such as protobuf, something which IMHO will always be more efficient than trying to compress small-payload text chunks.
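
To put rough numbers on the binary-versus-text point, here is a minimal Python sketch. It is not protobuf: it packs the example Gen2 response from this thread into a hand-rolled fixed layout with `struct`. The action codes, the field widths, and the assumption that value and requestId are always integers are hypothetical, chosen only for illustration. Protobuf itself adds a tag per field and uses variable-length integers, so its output would differ in detail.

```python
import json
import struct
from datetime import datetime, timezone

# Hypothetical action codes; a real schema would define these.
ACTIONS = {"get": 1, "set": 2, "subscribe": 3}
ACTION_NAMES = {code: name for name, code in ACTIONS.items()}

# Assumed layout, network byte order, no padding:
# action (uint8), timestamp (uint32 epoch seconds), value (int32), requestId (uint16)
LAYOUT = struct.Struct("!BIiH")
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def encode(msg: dict) -> bytes:
    """Pack the JSON-style message into the assumed fixed binary layout."""
    stamp = datetime.strptime(msg["timestamp"], TIME_FORMAT).replace(tzinfo=timezone.utc)
    return LAYOUT.pack(
        ACTIONS[msg["action"]],
        int(stamp.timestamp()),
        int(msg["value"]),
        int(msg["requestId"]),
    )


def decode(data: bytes) -> dict:
    """Reverse of encode(): rebuild the JSON-style message."""
    action, epoch, value, request_id = LAYOUT.unpack(data)
    stamp = datetime.fromtimestamp(epoch, tz=timezone.utc).strftime(TIME_FORMAT)
    return {
        "action": ACTION_NAMES[action],
        "timestamp": stamp,
        "value": str(value),
        "requestId": str(request_id),
    }


message = {
    "action": "get",
    "timestamp": "2020-08-25T13:37:00Z",
    "value": "123",
    "requestId": "999",
}

# Same spacing as the payload quoted in this thread: ", " between items, ":" after keys.
text = json.dumps(message, separators=(", ", ":")).encode("ascii")
binary = encode(message)

print("JSON text bytes:", len(text))
print("packed binary bytes:", len(binary))
```

The saving comes entirely from both ends agreeing on the schema beforehand, which is also what a protobuf or Avro schema would give us.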

That said, I think that both signals and RPCs will be more in the control plane than in the data plane, thus limiting the amount of data transmitted.

We still haven't solved data plane transmissions (video, dumps, audio, etc), possibly initiated via the control plane, but that plane needs to be highly optimized.

The only exception to the above that I can think of is if an OEM wants to stream real-time signals from a faulty vehicle, which may be needed in remote diagnostics, but that is a somewhat limited use case that will (hopefully) not be executed that often.

Regards,

/Magnus F.


-------------------
System Architect Manager
Jaguar Land Rover

Email: mfeuer1@jaguarlandrover.com
Mobile: +1 949 294 7871

Jaguar Land Rover North America, LLC
1450 NW 18th Ave, Portland, OR 97209
-------------------

________________________________
From: Ted Guild
Sent: Tuesday, September 01, 2020 08:28
To: Gunnar Andersson; Ulf Bjorkengren; public-automotive
Subject: Re: Comparison of compression algorithms

On the call I wondered about compression options being used on
WebSockets.

An outdated Stack Overflow thread points to a few different WebSocket
extensions that were done or under discussion (at the time) at the
IETF. Some seem abandoned.

https://stackoverflow.com/questions/19298651/how-does-websocket-compress-messages

https://mailarchive.ietf.org/arch/msg/hybi/_dWnwQrfIu2xdSI1WQI5Sx6zfZY/

Looking at WebSocket implementations (client and server),
libwebsockets and Boost.Beast have permessage-deflate.

https://en.wikipedia.org/wiki/Comparison_of_WebSocket_implementations

I'm not sure, and am asking colleagues, but it doesn't seem that
permessage-bzip2 et al., which were under discussion, went anywhere.
Several servers (e.g. nginx) can serve pre-gzipped files and can also
gzip on the fly, which we already know eats CPU and isn't as effective
on smaller responses.
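
For context, permessage-deflate (RFC 7692) compresses each message with raw DEFLATE, ends it with a sync flush, and strips the trailing four octets 0x00 0x00 0xff 0xff before sending. The receiver appends them again before inflating. The sketch below mimics only that framing step with Python's zlib, keeping the compression context between messages. It is not taken from libwebsockets or Boost.Beast; the class names and the second sample message are assumptions made up for illustration.

```python
import zlib

# Tail of a DEFLATE sync flush, removed on send and restored on receive.
TAIL = b"\x00\x00\xff\xff"


class Sender:
    """Compresses messages with one shared raw-DEFLATE context."""

    def __init__(self):
        self._comp = zlib.compressobj(wbits=-15)  # negative wbits = raw DEFLATE

    def encode(self, message: bytes) -> bytes:
        data = self._comp.compress(message) + self._comp.flush(zlib.Z_SYNC_FLUSH)
        return data[: -len(TAIL)]


class Receiver:
    """Inflates messages, mirroring the sender's shared context."""

    def __init__(self):
        self._decomp = zlib.decompressobj(wbits=-15)

    def decode(self, frame: bytes) -> bytes:
        return self._decomp.decompress(frame + TAIL)


messages = [
    b'{"action":"get", "timestamp":"2020-08-25T13:37:00Z", "value":"123", "requestId":"999"}',
    # Hypothetical follow-up response with the same keys.
    b'{"action":"get", "timestamp":"2020-08-25T13:37:01Z", "value":"124", "requestId":"1000"}',
]

sender, receiver = Sender(), Receiver()
frames = [sender.encode(m) for m in messages]
decoded = [receiver.decode(f) for f in frames]

for original, frame in zip(messages, frames):
    print(len(original), "->", len(frame), "bytes")
```

The first message gains little, but later messages with the same shape can reuse the earlier one as history, so any benefit depends on keeping the context per connection, which costs memory and CPU on both ends.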

We can/should also explore the alternate formats Gunnar suggested:

https://en.wikipedia.org/wiki/Apache_Avro
https://en.wikipedia.org/wiki/Protocol_Buffers

As the messages being transmitted are fairly small to begin with and
the in-vehicle use case will have extremely low latency, I also want to
understand how and where this would be useful, since
encoding/compressing and decoding will cost time and, depending on the
method, non-negligible CPU. If the client app is just sampling to
offboard and won't unpack (decode), then we should probably look at
Extended Vehicle and other solutions being used for off-boarding in
addition to formats. What problem[s] are we trying to solve here?

On Tue, 2020-09-01 at 12:18 +0000, Gunnar Andersson wrote:
> On Wed, 2020-08-26 at 11:25 +0200, Ulf Bjorkengren wrote:
> > I tried some online compression tools to see what kind of compression
> > standard compression algorithms can achieve on a typical Gen2 response
> > payload, shown below.
> > The results show that they do not perform well on this type of short
> > payload, and cannot compete with a tailor-made algorithm.
> > As a comparison, version two of the proprietary algorithm I mentioned
> > in the presentation will compress the same payload to 17 bytes.
> > If there is interest in it, this algorithm will be implemented and
> > available on
> > https://github.com/MEAE-GOT/W3C_VehicleSignalInterfaceImpl
> > in both a Go impl and a JS impl.
> >
> > Payload:
> > {“action”:”get”, “timestamp”:”2020-08-25T13:37:00Z”, “value”:”123”,
> > “requestId”:”999”}
> > The above payload is 86 chars.
> >
> > http://www.txtwizard.net/compression
> > GZ: Execution time: 11875 us Compression ratio: 112 % Original
> > size: 118
> > bytes Result size: 105 bytes
>
> I noticed that here it says the original size is 118, but above it is 86.
> It doesn't change anything fundamental, just pointing it out.
>
> It could be because something happened when pasting into a web tool.
> There might be some other encoding of the text going on.  I actually
> noticed when I copied the example from your HTML-formatted email, that
> the above was using left-and-right leaning " characters instead of a
> plain ASCII ", and I then get a 120 byte file with UTF-8 encoding, which
> suggests there were some multi-byte characters that snuck in.
>
> After fixing that and running gzip on the command line on pure ASCII,
> gzip causes an increase of the size from 86 to 100 bytes.
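
The byte counts discussed here can be checked with a short Python sketch. It assumes the payload is the single-line form with one space after each comma. Each of the 16 typographic quote characters takes three bytes in UTF-8 instead of one, which accounts for 86 + 32 = 118 bytes, the original size the web tool reported. The gzip result from this sketch need not match the command-line figure exactly, because the gzip header contents (such as a stored file name) can differ.

```python
import gzip

ascii_payload = (
    '{"action":"get", "timestamp":"2020-08-25T13:37:00Z", '
    '"value":"123", "requestId":"999"}'
)

# Simulate the HTML-formatted email: swap every plain quote for a
# typographic one (U+201D). U+201C has the same UTF-8 length.
curly_payload = ascii_payload.replace('"', "\u201d")

raw = ascii_payload.encode("ascii")
curly_raw = curly_payload.encode("utf-8")
compressed = gzip.compress(raw)

print("plain ASCII bytes:", len(raw))
print("with typographic quotes (UTF-8):", len(curly_raw))
print("gzip of plain ASCII:", len(compressed))
```

On an input this short, the fixed gzip header and trailer (18 bytes at minimum) outweigh what DEFLATE can save, so the output ends up larger than the input.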
>
> But none of those details really matter and the results are expected
> behavior on very small files - we already agreed to that yesterday.
>
> I don't want to waste time on the exact number of bytes in that
> comparison.  It is well known that much better compression is possible
> with any method where a predefined dictionary exists than with a
> general-purpose compression that is not allowed to agree on a lookup
> dictionary beforehand (which you have done for the keywords, and kind of
> indirectly also for the shortened UUID).
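
The effect of a pre-agreed dictionary can be seen even with a standard compressor. The sketch below uses zlib's preset-dictionary support (the `zdict` argument). The dictionary content is an assumption: it is simply the example payload with the values removed, and both sides would need to hold an identical copy. This is not the proprietary algorithm discussed above, only an illustration of the general principle.

```python
import zlib

payload = (
    b'{"action":"get", "timestamp":"2020-08-25T13:37:00Z", '
    b'"value":"123", "requestId":"999"}'
)

# Hypothetical shared dictionary: the message skeleton without values.
SHARED_DICT = b'{"action":"get", "timestamp":"", "value":"", "requestId":""}'


def deflate(data: bytes, zdict: bytes = b"") -> bytes:
    """Compress with zlib, optionally primed with a preset dictionary."""
    comp = zlib.compressobj(zdict=zdict) if zdict else zlib.compressobj()
    return comp.compress(data) + comp.flush()


def inflate(data: bytes, zdict: bytes = b"") -> bytes:
    """Decompress; the same dictionary must be supplied if one was used."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(data) + decomp.flush()


plain = deflate(payload)
primed = deflate(payload, SHARED_DICT)

print("original:", len(payload))
print("zlib without dictionary:", len(plain))
print("zlib with shared dictionary:", len(primed))
```

With the dictionary, the keys become back-references and only the values are sent as literals, which is the same gain a schema-based binary encoding gets by never sending the keys at all.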
>
> One thing that might still be useful, just to see if sticking to plain
> HTTP-supported compression is an option (which would be A LOT easier),
> is to perform a comparison of a large response to a large query.  I'm
> thinking that is where compression is also most important?
>
> If the goal is to truly minimize transfers, both large and small, then I
> wonder why the approach is not to use a full binary encoding instead.
> (Or if there are other goals, let's discuss them?)  As you probably
> understand, I am sceptical of the idea of creating a custom "compression
> algorithm" without clarifying what the point of that is.
>
> It might be a matter of definition, but I already tend to think about
> what you created as an /alternative encoding/ more than compression, and
> I'm thinking that a mind-shift towards that term might uncover even
> better alternatives?
>
> A super-tailored encoding for any task could be made optimal, but
> building on standards is worthwhile.
>
> It doesn't hurt to use a formal language to describe the message schema
> anyway.  (Perhaps the Gen2 specification ought to do that for the
> original JSON too?)  And as I mentioned on our previous call, Avro and
> Protobuf have languages to describe such schemas.  If you then use the
> associated tooling, the resulting binary encoding could be studied.
> There are /MANY/ other options out there, many of which were also
> discussed in a previous GENIVI project named "Generic Communication
> Protocols Evaluation".  I'm sure this is not a new discussion in W3C
> either, and I may have heard Ted mention that also.
>
> Sincerely,
> - Gunnar
>
--
Ted Guild <ted@w3.org>
W3C Automotive Lead
https://www.w3.org/auto
Received on Tuesday, 1 September 2020 18:20:30 UTC