Re: Comparison of compression algorithms from Ted Guild on 2020-09-02 (public-automotive@w3.org from September 2020)

From: Ted Guild <ted@w3.org>
Date: Wed, 02 Sep 2020 08:53:05 -0400
To: Gunnar Andersson <gandersson@genivi.org>, Ulf Bjorkengren <ulfbjorkengren@geotab.com>, public-automotive <public-automotive@w3.org>
Message-ID: <cf81d2d0842e62a0e31ff24cc7c16aa61b2d72a1.camel@w3.org>

I also received a response regarding my comment made during the call
last week, where I initially, and perhaps too hastily, dismissed EXI
for JSON since it first bloats the message into more verbose XML.
Apparently, if a schema is available, it skips the XML conversion step,
with the schema acting much like a dictionary.

I had heard of an unspecified OEM, or whoever their solution provider
is, using EXI for off-boarding. I'll see if I can reopen an old
discussion with the EXI folk.

In my humble opinion, more options open up for increasingly efficient
off-boarding (higher compression/serialization ratios, which means
fewer bytes) if we cache sampled data and send it in larger, buffered
chunks.
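
To illustrate the buffering point, here is a rough sketch using Python's
standard zlib module. The signal path, sample values, timestamps and
batch size are made up for illustration and are not taken from real VSS
data or Gen2 output; it only compares compressing each sample on its own
against compressing one buffered chunk.

```python
import json
import zlib

# Hypothetical samples; the path, values and timestamps are illustrative only.
samples = [
    {"path": "Vehicle.Speed", "value": 50 + i % 7, "ts": 1599051185 + i}
    for i in range(200)
]


def encode(obj):
    """Serialize an object to compact JSON bytes."""
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")


# Size of the uncompressed samples if each is sent as its own message.
raw_bytes = sum(len(encode(s)) for s in samples)

# Option 1: compress and send every sample individually.
per_message_bytes = sum(len(zlib.compress(encode(s))) for s in samples)

# Option 2: buffer all samples and compress them as a single chunk.
buffered_payload = zlib.compress(encode(samples))
buffered_bytes = len(buffered_payload)

print("raw:", raw_bytes)
print("compressed per message:", per_message_bytes)
print("compressed as one buffer:", buffered_bytes)
```

With tiny messages the compressor has almost no repetition to exploit
and its fixed framing overhead dominates, whereas the buffered chunk
lets the repeated keys and paths be encoded once. Actual savings will
depend on the data and on the format chosen.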

On Tue, 2020-09-01 at 15:19 -0400, Ted Guild wrote:
> As mentioned, asking around and was encouraged by a colleague to look
> at JSON compression schemes.
> 
> https://www.lucidchart.com/techblog/2019/12/06/json-compression-alternative-binary-formats-and-compression-methods/
> 
> On Tue, 2020-09-01 at 15:10 -0400, Ted Guild wrote:
> > On Tue, 2020-09-01 at 11:28 -0400, Ted Guild wrote:
> > > We can/should also explore alternate formats Gunnar suggested
> > > 
> > > https://en.wikipedia.org/wiki/Apache_Avro
> > > https://en.wikipedia.org/wiki/Protocol_Buffers
> > > 
> > > As these messages being transmitted are fairly small to begin with
> > > and in-vehicle use case will have extremely low latency, I also
> > > want to try to understand how/where this would be more useful as
> > > encoding/compressing and decoding will cost time and depending on
> > > method non-negligible cpu. If the client app is just sampling to
> > > offboard and won't unpack (decode), then we probably should look at
> > > Extended Vehicle and other solutions being used for off-boarding in
> > > addition to formats. What problem[s] are we trying to solve here?
> > 
> > We confirmed on the call today the primary use case is for
> > off-boarding data.
> > 
> > For a variety of reasons (security, connectivity, car being off),
> > data being off-boarded from vehicle to the cloud will be pushed, not
> > pulled.
> > 
> > Gen2 server instance can reside either in-vehicle or on a server in
> > the cloud.
> > 
> > For in-vehicle client apps that will be residing on the vehicle or
> > nearby (local network) trusted devices, compression is not needed.
> > Cloud servers will not be permitted to connect to Gen2 (ports not
> > open) on a vehicle to make pull requests. Gen2 supports clients
> > making pulls, HTTP GET and subscribe on Web Socket, it cannot
> > initiate a push to server in the cloud.
> > 
> > Gen2 supporting alternate, including binary, formats besides JSON
> > for more efficient local storage until it can send buffered data off
> > the vehicle would help with 'every byte counts' when sending data
> > off the vehicle.
> > 
> > VSS in protobuf format can make a ton of sense, perhaps with some
> > further optimization along lines of what Ulf and Sanjeev have been
> > working on.
> > 
> > Gen2 server residing in the cloud, exposing data already
> > off-boarded can respond to pull requests from client apps running
> > elsewhere in the cloud. Compression makes sense there.
> > 
> > If someone has a different architecture in mind where
> > vehicle<->cloud connections differ fundamentally or we can translate
> > a pull to push by redirecting and caching/buffering data, I would
> > like to hear it.
> > 
-- 
Ted Guild <ted@w3.org>
W3C Automotive Lead
https://www.w3.org/auto

Received on Wednesday, 2 September 2020 12:53:11 UTC