
RE: Efficient representation for Web formats

From: Peintner, Daniel (ext) <daniel.peintner.ext@siemens.com>
Date: Wed, 16 Nov 2016 09:57:11 +0000
To: "Stephen D. Williams" <sdw@lig.net>, "public-exi@w3.org" <public-exi@w3.org>
Message-ID: <D94F68A44EB1954A91DE4AE9659C5A980FF6E5FB@DEFTHW99EH1MSX.ww902.siemens.net>
Hi Stephen,

Thanks for your feedback and insights.

I think we might want to consider the widely used formats first. Next we may want to look at others.

Just from scanning "Delta", it seems to be JSON. We already have a solution for that, but I can see the rationale for improvements.

With regard to the others, I have limited knowledge, and we might want to look at them one by one.

What do you think?

Thanks,

-- Daniel



________________________________
From: Stephen D. Williams [sdw@lig.net]
Sent: Wednesday, 2 November 2016 14:24
To: public-exi@w3.org
Subject: Re: Efficient representation for Web formats

Since we're considering wider application in the realm of web technologies, I'll relate a number of areas that come to mind.  I've investigated these deeply for the last couple of years and have been implementing apps and services using many of them.

I've recently noted a number of delta formats in use with Javascript + browsers.  Compressing these would be helpful, especially rapid-fire sequences of incremental updates.  One common need is to save, transmit, and receive updates almost continuously in real time while someone is typing or doing something, which is somewhat like the telnet / IM/chat / MMO problems.  This is a good example of a nice design, implementation, and related theory:

https://quilljs.com/docs/delta/
https://en.wikipedia.org/wiki/Operational_transformation
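
To make the delta idea concrete, here is a minimal sketch in Python of applying a Quill-style delta to plain text. It assumes only the three basic operations (retain, insert, delete) and ignores attributes and embeds; the function name apply_delta is hypothetical and not part of Quill.

```python
from __future__ import annotations

from typing import Any


def apply_delta(text: str, ops: list[dict[str, Any]]) -> str:
    """Apply Quill-style ops (retain / insert / delete) to a plain string.

    Simplified sketch: formatting attributes and non-text embeds are ignored.
    """
    out: list[str] = []
    pos = 0  # read cursor into the original text
    for op in ops:
        if "retain" in op:
            # Copy the next N characters of the original unchanged.
            out.append(text[pos:pos + op["retain"]])
            pos += op["retain"]
        elif "insert" in op:
            # New text; does not consume anything from the original.
            out.append(op["insert"])
        elif "delete" in op:
            # Skip the next N characters of the original.
            pos += op["delete"]
        else:
            raise ValueError(f"unknown op: {op!r}")
    out.append(text[pos:])  # text after the last op is kept as-is
    return "".join(out)


doc = "Hello world"
# Keep "Hello ", insert "EXI", and delete "world".
delta = [{"retain": 6}, {"insert": "EXI"}, {"delete": 5}]
result = apply_delta(doc, delta)
```

Each keystroke in an editor produces a small delta like this one, which is why long runs of them are so repetitive and a natural target for compact encoding.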


This may seem only tangentially related, but direct support for graphs of objects, and for updates / deltas to graphs of objects, is an important area. I feel that thinking in graph API terms, and understanding the desirable app features (caching, only transmitting what is needed, infinite scrolling with just in time retrieval, scalable data / communications / users / UI) directly affects thinking about data formats and API / communication patterns.

For general data, graphs of JSON objects are very popular.  Pulling from older relational and REST ideas, avoiding a lot of dead ends, and learning from RDF/semantic web and other graph database and NoSQL systems and methods, there are two interesting conclusions:

1. Data is always a graph; better to explicitly model that in a clean and resilient way.  (What is the main thing that is completely absent in a Relational Database?  The relations!  Those are buried in code that does joins and similar.  One of the more ironic computer science instances.)

2. When you consider the app logic required to build an efficient, low-latency, maintainable, and scalable app, a number of features and characteristics should be designed for directly in the data model, API, communication pattern, and application models on both front and back end.  Microservices and containerization reinforce that.  One important type of solution is graph APIs.  The leading projects there are Facebook's GraphQL and Netflix's Falcor.  The latter is easier to understand and seems easier to use, but there are some gaps and difficulties.  GraphQL has now gained a lot more momentum and broad support.  Somewhat related are the real-time, pub/sub-like standing queries offered by Firebase and Horizon/RethinkDB, and being added by others (OrientDB etc.).  However, it seems easier to get the graph API idea by reading Falcor's description:

https://netflix.github.io/falcor/documentation/model.html
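
As a rough illustration of the "one big graph" model, the sketch below resolves a path through a JSON document in which some nodes are references to other locations, in the style of Falcor's JSON Graph. It is written in Python for brevity, the helper names (is_ref, get_path) and the sample data are hypothetical, and the real Falcor semantics differ in details (for example, there is no cycle protection here).

```python
from __future__ import annotations

from typing import Any


def is_ref(node: Any) -> bool:
    """True for nodes shaped like {"$type": "ref", "value": [...path...]}."""
    return isinstance(node, dict) and node.get("$type") == "ref"


def get_path(graph: dict, path: list) -> Any:
    """Walk `path` from the root, following any reference that is encountered.

    Simplified sketch: a reference restarts the lookup from the graph root.
    """
    node: Any = graph
    for key in path:
        node = node[key]
        if is_ref(node):
            node = get_path(graph, node["value"])
    return node


# Each todo is stored once under "todosById"; the "todos" list only points to them.
graph = {
    "todosById": {
        "44": {"name": "get milk", "done": False},
        "54": {"name": "withdraw money", "done": True},
    },
    "todos": [
        {"$type": "ref", "value": ["todosById", "44"]},
        {"$type": "ref", "value": ["todosById", "54"]},
    ],
}

name = get_path(graph, ["todos", 0, "name"])
```

Because every entity lives at exactly one place in the graph, a client can cache it once and request only the paths it is missing.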

Completely different, yet sharing the same 'everything is one big graph' idea (and being very useful for it):
https://workflowy.com/


WebSockets can be assumed to be the communication link for a large range of latency sensitive applications.  Not much to do there as it was tough to overlay bidirectional async communications on the existing protocols, gateways, and other legacy components.  It's all deployed and baked in now.  Unfortunate that it couldn't be better, but it's not bad for most things.  It is worth keeping in mind as an efficient bidirectional communication link, which could make certain strategies seem more broadly feasible.  For instance, efficient interleaved async packets can allow for an extra negotiation round trip without much impact for some cases.


There are efforts toward solutions for at least the Emscripten / Asm.js subset of Javascript with WebAssembly.  There is a growing need to minimize parse and related startup times for Javascript while still not making many assumptions about implementation, and without hindering new language features; several important additions to core syntax were made not long ago.

https://brendaneich.com/2015/06/from-asm-js-to-webassembly/
https://www.w3.org/community/webassembly/

Handling possibly binary and structured data well is also needed.  This ought to be harmonized with supercomputing and emerging ML data formats.  While ML training won't be done often in web browsers, running the resulting models, which generally take far less computation and memory, is already somewhat widespread.

http://wiki.ecmascript.org/doku.php?id=harmony:typed_objects

A key subset of non-{image/video/code} binary data is 3D.  This is becoming much more important.  It is a crucial interest for my company and projects right now in support of simulation and game engine content.  Some work has been done, although it isn't necessarily optimal and it isn't implemented completely enough to be very / widely usable yet.

This is the leading and maybe principal effort right now, although we couldn't use it without a lot of completion work.  We have to use FBX, DAE (COLLADA), and OBJ.  Everything else is too slow, missing features, or not popular:

https://github.com/KhronosGroup/glTF

Example usage:
http://cesiumjs.org/2015/08/10/introducing-3d-tiles/

A random sample of older and/or proprietary attempts to solve this:
http://openctm.sourceforge.net/
http://www.finalmesh.com/index.htm

ThreeJS is the leading base library for 3D on WebGL.

Stephen

On 11/2/16 2:16 AM, Peintner, Daniel (ext) wrote:

Hi,

As Taki recently discussed in [1] the focus of the EXI working group has changed a bit.

Besides the work on XML, the working group is also exploring the idea of applying EXI to other Web formats. The WG is making JSON exchange more efficient by applying EXI to JSON [2]. We have recently also started exploring the idea of applying EXI to CSS [3] and JavaScript [4].

EXI is a general format that sends an efficient stream of events. It can yield noticeable, measurable savings in CPU, memory, and bandwidth over approaches such as minification and/or gzip, which require the receiver to reconstitute the original JSON/CSS/JavaScript/... text and parse it again.
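
The event-stream idea can be sketched as follows. This is only an illustration in Python, not the actual EXI encoding: the event names and the helpers to_events and from_events are hypothetical. The point it shows is that a receiver can rebuild a value directly from events without a text parsing step.

```python
from __future__ import annotations

from typing import Any, Iterable, Iterator


def to_events(value: Any) -> Iterator[tuple[str, Any]]:
    """Flatten a JSON-like value into a stream of (kind, payload) events."""
    if isinstance(value, dict):
        yield ("start_map", None)
        for key, item in value.items():
            yield ("key", key)
            yield from to_events(item)
        yield ("end_map", None)
    elif isinstance(value, list):
        yield ("start_array", None)
        for item in value:
            yield from to_events(item)
        yield ("end_array", None)
    else:
        yield ("value", value)


def from_events(events: Iterable[tuple[str, Any]]) -> Any:
    """Rebuild the value from events; no intermediate text is produced or parsed."""
    it = iter(events)

    def read(event: tuple[str, Any]) -> Any:
        kind, payload = event
        if kind == "value":
            return payload
        if kind == "start_array":
            items = []
            for ev in it:  # nested reads consume from the same iterator
                if ev[0] == "end_array":
                    return items
                items.append(read(ev))
            raise ValueError("unterminated array")
        if kind == "start_map":
            result = {}
            for ev in it:
                if ev[0] == "end_map":
                    return result
                if ev[0] != "key":
                    raise ValueError("expected a key event")
                result[ev[1]] = read(next(it))
            raise ValueError("unterminated map")
        raise ValueError(f"unexpected event: {kind}")

    return read(next(it))


message = {"op": "insert", "pos": 6, "tags": ["a", "b"]}
events = list(to_events(message))
rebuilt = from_events(events)
```

A real encoder would additionally assign compact codes to the events and values; the sketch stops at the event level.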

We encourage you to take a look at our exploration of EXI for CSS [3] and EXI for JavaScript [4] and to provide feedback/comments.

Thank you,

Daniel (for the EXI WG)

[1] https://lists.w3.org/Archives/Public/public-exi/2016Oct/0004.html
[2] EXI for JSON, https://www.w3.org/TR/exi-for-json/
[3] EXI for CSS, https://github.com/EXIficient/exificient-for-css
[4] EXI for JavaScript, https://github.com/EXIficient/exificient-for-javascript/

--
Stephen D. Williams sdw@lig.net stephendwilliams@gmail.com LinkedIn: http://sdw.st/in
V:650-450-UNIX (8649) V:866.SDW.UNIX V:703.371.9362 F:703.995.0407
AIM:sdw Skype:StephenDWilliams Yahoo:sdwlignet Resume: http://sdw.st/gres
Personal: http://sdw.st facebook.com/sdwlig twitter.com/scienteer
Received on Wednesday, 16 November 2016 09:57:56 UTC
