Re: Binary Infosets

Hi Martin,

Martin Duerst wrote:
> At 08:12 02/10/14 +0200, Chris Lilley wrote:
>> 1) WAP-style (fixed, enumerated tokens, non-adaptable, impossible to add
>> another element or attribute or namespace) compression was not a good
>> idea as a way forward. Agreed.
> 
> Agreed in the long term. For what WAP wanted to do, it wasn't too bad.
> But it was just that compression turned out to not be necessary.

It also doesn't fly too well for SVG. One of the most common ways to 
extend SVG is to add a new element in another namespace and to somehow 
bind scripting to it. Within SVG, a WAP-style scheme is imho bound to be 
DOA (or to make SVG a lot less attractive to content creators). I know 
for instance that none of the SVG applications that I have produced over 
the past two years would work with that type of encoding.

>> MD> - Graphics is in some way different (e.g. a) the average Web page
>> MD>    size that makes sense on a mobile phone is much smaller than
>> MD>    the average amount of data needed for graphics (in particular
>> MD>    with animation),
>>
>> That aspect is true, yes.
> 
> Do we have any good grip on this, any kinds of numbers?

I guess numbers could be had if we had a definition of "make sense" ;) 
However, I agree with this point. Chances are that a lot of the XHTML
you'll get on a mobile will be (or could be) low on markup, leaving only
text. With graphic formats such as SVG, you necessarily have more markup 
(not just elements, but also attributes which are required for elements 
to be of any use). Markup naturally tends to produce overhead.

> Sorry, I was unclear here. What I wanted to say is that maybe
> it is easier (in the sense of higher compression rates) to compress
> SVG further into a binary format than to do the same with (X)HTML.
> I do not know whether this is true, but there could be various
> reasons for this:
> - More markup/styling for SVG than for (X)HTML, and markup/styling
>   is usually easier to compress than data.
> - Data (e.g. path data, which is mostly numbers plus a few letters)
>   has a higher redundancy.

Talking from my experience (which is pretty much limited to compressing 
PSVIs in this case) it is partially true and partially false.

Partially true because XHTML is likely to be more text than structure,
and text only compresses as well as your usual compression technology allows.
SVG on the other hand *tends* to have more structure, more markup, and 
locally specific structures (path data, transform lists, etc) that 
compress well using PSVI compression with specific type-based codecs. In 
some cases it is considered acceptable to apply lossy compression (e.g.
quantisation) to some graphics, which can gain even more (the loss may not be
visible on a mobile's screen).
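
To make the quantisation point concrete, here is a minimal sketch in Python of a type-specific codec for a list of points. It is not an actual PSVI codec: the function names, the 0.5 grid step and the delta-encoded signed 16-bit layout are all assumptions chosen for illustration.

```python
import struct

def encode_points(points, step=0.5):
    """Quantise (x, y) floats to a grid of size `step`, then store
    each point as a pair of signed 16-bit deltas from the previous one."""
    out = bytearray()
    prev_x = prev_y = 0
    for x, y in points:
        qx, qy = round(x / step), round(y / step)  # lossy step
        out += struct.pack("<hh", qx - prev_x, qy - prev_y)
        prev_x, prev_y = qx, qy
    return bytes(out)

def decode_points(data, step=0.5):
    """Rebuild the quantised points; each is within step/2 of the input."""
    points = []
    qx = qy = 0
    for dx, dy in struct.iter_unpack("<hh", data):
        qx += dx
        qy += dy
        points.append((qx * step, qy * step))
    return points
```

Each point costs 4 bytes here regardless of how many digits the textual form used, and the error per coordinate is at most half the grid step. The sketch assumes every delta fits in 16 bits; struct.pack raises an error otherwise.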

It is also partially false because a well-designed profile of XHTML with 
not too many elements and less text than in the above case will compress 
better than an SVG document with dozens of optional presentation 
attributes (ie, lots of entropy) on each and every element.
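
A back-of-the-envelope way to see the entropy cost of optional attributes, sketched in Python. It assumes, purely for illustration, that each optional attribute appears independently with the same probability; real documents will not be that tidy, and the function names are hypothetical.

```python
import math

def binary_entropy(p):
    """Shannon entropy, in bits, of a yes/no event with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def presence_bits(n_optional, p):
    """Lower bound on the bits needed per element just to say which
    of n_optional attributes are present (independence assumed)."""
    return n_optional * binary_entropy(p)
```

Under that assumption, 24 optional attributes that each appear half the time cost 24 bits per element before any value is encoded, whereas attributes that are almost never (or almost always) present cost far less.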

In other words: it depends, but the chances are that in most cases SVG 
will be a better candidate for compression (especially if you have a 
lazy DOM based on a binary representation, instead of an SVG DOM that 
must have an object containing data for each point).
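
By "lazy DOM" I mean something along the lines of the following Python sketch: the points stay in the binary buffer and are only decoded when asked for. The class name and the buffer layout (two little-endian 32-bit floats per point) are assumptions for illustration, not any real SVG DOM API.

```python
import struct

class LazyPoints:
    """Read-only point list that decodes (x, y) pairs from a
    binary buffer on access instead of holding one object per point."""

    # Assumed layout: two little-endian float32 values per point.
    _PAIR = struct.Struct("<ff")

    def __init__(self, buffer):
        self._buffer = buffer

    def __len__(self):
        return len(self._buffer) // self._PAIR.size

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        return self._PAIR.unpack_from(self._buffer, index * self._PAIR.size)
```

The memory cost is then the buffer itself plus one small wrapper, however many points the path contains.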

-- 
Robin Berjon <robin.berjon@expway.fr>
Research Engineer, Expway
7FC0 6F5F D864 EFB8 08CE  8E74 58E6 D5DB 4889 2488

Received on Wednesday, 16 October 2002 09:34:24 UTC