RE: EXI WG's inquiry about ISSUE-2050

Hi Robin, all,

Following the emails our groups exchanged about the upcoming SVG path d representation, the EXI working group was keen to know how the planned changes would affect EXI representations. We therefore did some analysis that we would like to share. The basic idea was to find out how well EXI works with the 'next' SVG version and what could be done to achieve even better EXI compression.

To do so, we defined and measured the following steps (see the figures in the Excel sheet):

# SVG
SVG test cases as they exist today.

# EXI_SVG
Run EXI on the SVG data as it exists today, with schema knowledge:
 - only qname strings are pre-populated
 - no EXI datatype support; all values are typed as string

# SVG2
Convert the path XML data to the form the SVG working group is heading towards, e.g., <path d="M64.7,140.1" /> becomes <path><movetoAbs><x>64.7</x><y>140.1</y></movetoAbs></path>
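
A minimal sketch of such a conversion follows. It is not the tool used for the measurements. Only the element names movetoAbs, x and y come from the example above; the name linetoAbs, the function path_d_to_elements and the restriction to the absolute M and L commands are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical mapping from path command letters to element names.
# Only "movetoAbs" appears in the message; "linetoAbs" is an assumed name.
COMMAND_ELEMENTS = {"M": "movetoAbs", "L": "linetoAbs"}

# A token is either a command letter or a number (commas/spaces are skipped).
TOKEN = re.compile(r"([A-Za-z])|([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)")


def path_d_to_elements(d):
    """Convert a path 'd' string (absolute M/L only) into a structured <path>."""
    path = ET.Element("path")
    command = None
    coords = []
    for letter, number in TOKEN.findall(d):
        if letter:
            if letter not in COMMAND_ELEMENTS:
                raise ValueError("unsupported command in this sketch: " + letter)
            command = letter
            coords = []
            continue
        if command is None:
            raise ValueError("coordinate found before any command")
        coords.append(number)
        if len(coords) == 2:
            segment = ET.SubElement(path, COMMAND_ELEMENTS[command])
            ET.SubElement(segment, "x").text = coords[0]
            ET.SubElement(segment, "y").text = coords[1]
            coords = []
            # In SVG path data, extra coordinate pairs after an absolute
            # moveto are treated as implicit absolute lineto commands.
            if command == "M":
                command = "L"
    if coords:
        raise ValueError("odd number of coordinates")
    return path


print(ET.tostring(path_d_to_elements("M64.7,140.1"), encoding="unicode"))
```

The coordinates are kept as the original text, so the structured form carries exactly the same values as the d attribute; only the markup around them changes.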

# EXI_SVG2
Apply EXI to SVG2 with typed schema knowledge for the path d elements.


Further EXI improvement is feasible:
 - in some cases coordinates can be typed as integer instead of float
 - provide more accurate schema knowledge
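
One way to check whether a set of coordinates could be typed as integer is sketched below. The message does not say which cases are meant, so this rests on an assumption: coordinates count as integer-representable if they are already whole numbers, or become whole numbers after multiplying by a small power of ten. The function integer_scale and its max_digits parameter are hypothetical names.

```python
from decimal import Decimal


def integer_scale(values, max_digits=3):
    """Return the smallest k in 0..max_digits such that every value times
    10**k is a whole number, or None if no such k exists.

    k == 0 means the coordinates are already integers; k > 0 means they
    would be integers after scaling (an assumption of this sketch).
    """
    decimals = [Decimal(v) for v in values]
    for k in range(max_digits + 1):
        factor = Decimal(10) ** k
        if all(d * factor == (d * factor).to_integral_value() for d in decimals):
            return k
    return None


print(integer_scale(["64", "140"]))      # already integers
print(integer_scale(["64.7", "140.1"]))  # integers after scaling by 10
```

Decimal is used instead of float so that a value such as 64.7 is tested exactly as written in the document rather than as its binary approximation.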


The results confirm our expectation that EXI works well with partial schema information (EXI size being about 20% of the original SVG size). Nonetheless, the more schema information is available, the better. Tuning schema types (float vs. integer) as mentioned above may also be beneficial.

We hope the results are of interest to you as well.
Please don't hesitate to get back to us if you have any further questions.

-- Daniel




-----Original Message-----
From: Robin Berjon [mailto:robin@berjon.com] 
Sent: Monday, May 21, 2012 8:50 PM
To: liam@w3.org
Cc: Takuki Kamiya; SVG public list; member-exi-wg@w3.org
Subject: Re: EXI WG's inquiry about ISSUE-2050

On May 21, 2012, at 17:40 , Liam R E Quin wrote:
> On Mon, 2012-05-21 at 14:04 +0200, Robin Berjon wrote:
>> XSD couldn't capture context-dependent constraints.
> 
> XSD 1.1 has some support for this.

Right (though IIRC EXI doesn't make use of that?)

>> My experience with binarising SVG is that you gain most from custom 
>> codecs (or by changing the syntax, which is essentially the same) and 
>> less than you'd hope from the structural redundancy.
> 
> One difference (as of course you know) between EXI and some of the 
> others is that one could have e.g. a degooper that generated SAX 
> events, with nary a pointy bracket in sight.

Indeed, which can help produce a more regular SVG tree - but in this specific case that won't buy you a lot. The problem is that the content model for a lot of elements is choice(lots of options){0,*} and each of those elements has dozens of optional attributes - which will tend to be costly to encode even when schema-informed. Hence the suggestion to experiment with encoding the rarer ones as errors. I wouldn't be surprised if the result came out smaller in most cases.

> Overall, using EXI with serialized HTML 5 might also be a worth-while 
> goal, including embedded mathml and svg.

Yes, I would dearly love to see EXI applied to HTML.

--
Robin Berjon - http://berjon.com/ - @robinberjon

Received on Friday, 12 October 2012 11:37:26 UTC