RE: Reporting my findings on Action 123 (http://www.w3.org/Fonts/WG/track/actions/open)

Folks,

I suspect that I have a bug in my code and that the evaluated number of bytes saved is incorrect (I didn't account for all the different cases where deltas can be equal to "0"). However, the number of points that can be eliminated is not going to be affected by it since I evaluated them using their actual x/y coordinates. So, for now please disregard the number of bytes saved.
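
For context on why the byte count is easy to get wrong, here is a minimal sketch of how many bytes one coordinate delta occupies in a simple glyph of the 'glyf' table. It is not the code used for the experiment. The function names `delta_size` and `coordinate_bytes` are made up for illustration, and the sketch counts coordinate bytes only (no flag bytes, no flag repeat counts, no instructions).

```python
def delta_size(delta: int) -> int:
    """Bytes used to store one x or y delta in a simple glyph."""
    if delta == 0:
        return 0  # "same as previous" flag bit is set; no coordinate bytes
    if -255 <= delta <= 255:
        return 1  # short vector: 1-byte magnitude, sign kept in the flag
    return 2      # signed 16-bit delta


def coordinate_bytes(points):
    """Total coordinate bytes for a list of absolute (x, y) points.

    The first point is stored as a delta from (0, 0).
    Flag bytes are deliberately not counted (assumption of this sketch).
    """
    total = 0
    prev_x = prev_y = 0
    for x, y in points:
        total += delta_size(x - prev_x) + delta_size(y - prev_y)
        prev_x, prev_y = x, y
    return total
```

A point whose x or y delta is zero costs nothing for that axis, so the bytes saved by dropping a point range from 0 to 4 depending on its deltas.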

The real question is whether eliminating predictable points will produce any meaningful savings *after* the entropy coding step is applied. Since all coordinates in the glyf table are expressed as deltas, I wonder how the entropy coder handles them (and I suspect that it is quite good at dealing with deltas).
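
As a reference for what "predictable" means in the original message quoted below, here is a minimal sketch of the test. It assumes that a point is predictable when it is an on-curve point lying exactly at the midpoint of its two off-curve neighbours. It is not the code used for the experiment, and the function name `predictable_points` and the `(x, y, on_curve)` tuple layout are made up for illustration.

```python
def predictable_points(contour):
    """Return indices of on-curve points that can be predicted exactly.

    contour: list of (x, y, on_curve) tuples describing one closed contour,
    with integer coordinates. A point qualifies when both neighbours are
    off-curve and the point is exactly their midpoint (assumed criterion).
    """
    n = len(contour)
    found = []
    if n < 3:
        return found  # too few points for two distinct neighbours
    for i, (x, y, on_curve) in enumerate(contour):
        if not on_curve:
            continue
        px, py, prev_on = contour[i - 1]        # wraps around to the last point
        nx, ny, next_on = contour[(i + 1) % n]  # wraps around to the first point
        if prev_on or next_on:
            continue
        # Compare doubled values to stay in integer arithmetic.
        if 2 * x == px + nx and 2 * y == py + ny:
            found.append(i)
    return found
```

Dividing the number of indices returned by the total number of points, summed over all contours, gives the per-font percentage of predictable points.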

Thank you,
Vlad


> -----Original Message-----
> From: Levantovsky, Vladimir
> Sent: Tuesday, December 10, 2013 12:35 PM
> To: 'David Kuettel'
> Cc: public-webfonts-wg@w3.org
> Subject: RE: Reporting my findings on Action 123
> (http://www.w3.org/Fonts/WG/track/actions/open)
> 
> Hi all,
> 
> Like we discussed at the last telcon, I ran my experiment on a larger
> font set (MT web font corpus) - please see the results attached. The
> average percentage of points that can be predicted (and therefore,
> eliminated from a compressed font file) inched a bit higher to ~2%.
> Let's discuss this tomorrow during the telcon.
> 
> David, if you could please replace the preliminary results with these
> on your Google Drive, I'd really appreciate your help!
> 
> Thank you,
> Vlad
> 
> 
> > -----Original Message-----
> > From: David Kuettel [mailto:kuettel@google.com]
> > Sent: Wednesday, December 04, 2013 4:00 PM
> > To: Levantovsky, Vladimir
> > Cc: public-webfonts-wg@w3.org
> > Subject: Re: Reporting my findings on Action 123
> > (http://www.w3.org/Fonts/WG/track/actions/open)
> >
> > Thank you Vlad.  I forgot to double check the link.  My apologies.
> > Please try this link instead:
> >
> > Vlad's On-Curve Point Optimization Gains
> >
> > https://docs.google.com/a/google.com/spreadsheet/ccc?key=0AvcH1ZzSrGMGdEFUMFlEUkFmQ0JCRmFTVGgyNEllRUE&usp=sharing#gid=0
> >
> > On Wed, Dec 4, 2013 at 11:57 AM, Levantovsky, Vladimir
> > <Vladimir.Levantovsky@monotype.com> wrote:
> > > Thank you David, the online spreadsheet is a nice tool, I keep
> > > forgetting that we can share the data using ways that don't require an
> > > installed application suite ;-) When I tried to access the file though
> > > it said that the file doesn't exist (yet?) - can you please check into
> > > that?
> > >
> > > Meanwhile, I've made a few changes to my toy project and extended the
> > > collected dataset to count the exact number of bytes saved if we
> > > eliminate the coordinates of predictable points. The slightly updated
> > > spreadsheet is attached, as you can see each eliminated point consumes
> > > on average 3.12 bytes.
> > >
> > > Talk to you all soon,
> > > Vlad
> > >
> > >
> > >> -----Original Message-----
> > >> From: David Kuettel [mailto:kuettel@google.com]
> > >> Sent: Wednesday, December 04, 2013 2:26 PM
> > >> To: Levantovsky, Vladimir
> > >> Cc: public-webfonts-wg@w3.org
> > >> Subject: Re: Reporting my findings on Action 123
> > >> (http://www.w3.org/Fonts/WG/track/actions/open)
> > >>
> > >> Fantastic, thank you Vlad!  Looking forward to discussing this in the
> > >> working group meeting today.  To aid in the discussion, I created
> > >> an online spreadsheet along with a chart of the optimization gains.
> > >>
> > >> Vlad's On-Curve Point Optimization Gains
> > >> https://docs.google.com/spreadsheets/d/1PA9ssfAdWh2GKhhgStkw0-yiiNAeG1zdfZqRzAVWaXM/edit?usp=sharing
> > >>
> > >> It would be fascinating to see the results of the experiment across
> > >> more font collections, esp. to see if any trends/patterns emerged.
> > >>
> > >> On Tue, Dec 3, 2013 at 2:40 PM, Levantovsky, Vladimir
> > >> <Vladimir.Levantovsky@monotype.com> wrote:
> > >> > Folks,
> > >> >
> > >> >
> > >> >
> > >> > <Rant>
> > >> >
> > >> > With the Thanksgiving holidays and all travel behind I came back to
> > >> > the office to a backlog of over 500 emails in my Inbox. Some
> > >> > folks clearly don't like holidays and prefer to work overtime - I
> > >> > figured that it may be a good day to forget about emails and just do
> > >> > something else instead, e.g. exploring on-curve point
> > >> > optimization. :-)
> > >> >
> > >> > </Rant>
> > >> >
> > >> >
> > >> >
> > >> > Here are the preliminary results (attached) - so far I ran the test
> > >> > only on the fonts I have installed on my computer (without
> > >> > prejudice). The numbers reported are:
> > >> >
> > >> > -          total number of all points for all contours defined in a
> > >> > 'glyf' table;
> > >> >
> > >> > -          number of on-curve points where their coordinates can be
> > >> > predicted *precisely* by using the coordinates of two adjacent
> > >> > off-curve points (and, therefore, the actual coordinates can be
> > >> > eliminated from the pre-processed output by simply using one
> > >> > reserved bit in 'flags' field to mark the point as
> > >> > "predictable"), and
> > >> >
> > >> > -          percentage of points that can be predicted, per font.
> > >> >
> > >> >
> > >> >
> > >> > As you can see, while individual font results vary significantly,
> > >> > the average percentage of all points that can be predicted [with
> > >> > respective coordinates eliminated as redundant info] is about
> > >> > 1.42%. Considering that point coordinates may use either one- or
> > >> > two-byte formats, the actual file size saving is likely to be
> > >> > somewhat smaller; my guess is that it would yield savings of around
> > >> > 0.7-1% (this statement has not been evaluated by the FDA!)
> > >> >
> > >> >
> > >> >
> > >> > Let's discuss this over email and during the call tomorrow and
> > >> > see if there is a desire to do more about it.
> > >> >
> > >> >
> > >> >
> > >> > Cheers,
> > >> >
> > >> > Vlad
> > >> >
> > >> >

Received on Wednesday, 11 December 2013 16:47:06 UTC