Re: Reporting my findings on Action 123 (http://www.w3.org/Fonts/WG/track/actions/open) from David Kuettel on 2013-12-11 (public-webfonts-wg@w3.org from December 2013)

From: David Kuettel <kuettel@google.com>
Date: Wed, 11 Dec 2013 14:27:15 -0800
To: "Levantovsky, Vladimir" <Vladimir.Levantovsky@monotype.com>
Cc: "public-webfonts-wg@w3.org" <public-webfonts-wg@w3.org>
Message-ID: <CAAYUqgFsmafM-mexeLBhBsLuUq6=boe6gN=60puYEziWLzBBtQ@mail.gmail.com>
On Wed, Dec 11, 2013 at 8:46 AM, Levantovsky, Vladimir <
Vladimir.Levantovsky@monotype.com> wrote:

> Folks,
>
> I suspect that I have a bug in my code and that the evaluated number of
> bytes saved is incorrect (I didn't account for all the different cases
> where deltas can be equal to "0"). However, the number of points that can
> be eliminated is not going to be affected by it since I evaluated them
> using their actual x/y coordinates. So, for now please disregard the number
> of bytes saved.
>

Great catch, Vlad!  That is a bummer though; the estimated byte savings were
significant for some of the fonts.  I have tentatively updated the online
spreadsheet accordingly (greying out the "Bytes saved" / "Bytes/point"
columns for now, but I can remove them completely).
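
For reference, here is a rough Python sketch of how the zero-delta case
could be accounted for when counting coordinate bytes.  It assumes the
standard 'glyf' simple-glyph encoding (a zero delta is carried by a flag bit
alone, a delta with magnitude up to 255 takes one byte, anything else takes
two).  The function names are made up for illustration, flag bytes are
ignored, and this is not Vlad's actual code.

```python
def coord_bytes(delta: int) -> int:
    """Bytes one x or y delta occupies in a 'glyf' coordinate array.

    Assumption: standard simple-glyph encoding; flag bytes are not counted.
    """
    if delta == 0:
        # "Same as previous" is signalled by a flag bit; no byte is stored.
        return 0
    if -255 <= delta <= 255:
        # Short vector: magnitude in one byte, sign carried by a flag bit.
        return 1
    # Otherwise the delta is stored as a signed 16-bit value.
    return 2


def point_bytes(dx: int, dy: int) -> int:
    """Coordinate bytes used by one point (hypothetical helper)."""
    return coord_bytes(dx) + coord_bytes(dy)
```

With this accounting a point can cost anywhere from 0 to 4 coordinate bytes,
which is why the per-point average depends so much on how zero deltas are
handled.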

>
> The real question is whether eliminating predictable points will produce
> any meaningful savings *after* the entropy coding step is applied. Since
> all coordinates in the glyf table are expressed as deltas - I wonder how
> the entropy coder is taking care of them (and I suspect that it is quite
> good at dealing with the deltas).
>

Definitely.  Once the optimization has been added to the reference
compression tool (thank you again for volunteering to take this on,
Jonathan), we can gather the post-Brotli numbers and then review them all
together.
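
For anyone following along, the optimization in question might be sketched
roughly as below.  This assumes that "predictable" means the on-curve point
sits exactly at the midpoint of its two off-curve neighbours (as with
TrueType implied on-curve points), and that a contour is a closed list of
(x, y, on_curve) tuples.  The names are hypothetical and the reference tool
may well do it differently.

```python
from typing import List, Tuple

# (x, y, on_curve) - a hypothetical in-memory representation of one point.
Point = Tuple[int, int, bool]


def predictable_indices(contour: List[Point]) -> List[int]:
    """Indices of on-curve points lying exactly midway between two
    adjacent off-curve points. The contour is treated as closed."""
    n = len(contour)
    found: List[int] = []
    if n < 3:
        return found  # need two distinct neighbours
    for i, (x, y, on_curve) in enumerate(contour):
        if not on_curve:
            continue
        px, py, prev_on = contour[i - 1]        # wraps to the last point for i == 0
        nx, ny, next_on = contour[(i + 1) % n]  # wraps to the first point at the end
        if prev_on or next_on:
            continue
        # Compare doubled values so the midpoint test stays exact in integers.
        if 2 * x == px + nx and 2 * y == py + ny:
            found.append(i)
    return found


def predictable_percentage(contours: List[List[Point]]) -> float:
    """Share of all points (in percent) that are predictable."""
    total = sum(len(c) for c in contours)
    hits = sum(len(predictable_indices(c)) for c in contours)
    return 100.0 * hits / total if total else 0.0
```

Each index returned is a point whose coordinates could be dropped and
rebuilt from its neighbours, at the cost of one reserved flag bit.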

>
> Thank you,
> Vlad
>
>
> > -----Original Message-----
> > From: Levantovsky, Vladimir
> > Sent: Tuesday, December 10, 2013 12:35 PM
> > To: 'David Kuettel'
> > Cc: public-webfonts-wg@w3.org
> > Subject: RE: Reporting my findings on Action 123
> > (http://www.w3.org/Fonts/WG/track/actions/open)
> >
> > Hi all,
> >
> > Like we discussed at the last telcon, I ran my experiment on a larger
> > font set (MT web font corpus) - please see the results attached. The
> > average percentage of points that can be predicted (and therefore,
> > eliminated from a compressed font file) inched a bit higher to ~2%.
> > Let's discuss this tomorrow during the telcon.
> >
> > David, if you could please replace the preliminary results with these
> > on your Google Drive, I'd really appreciate your help!
> >
> > Thank you,
> > Vlad
> >
> >
> > > -----Original Message-----
> > > From: David Kuettel [mailto:kuettel@google.com]
> > > Sent: Wednesday, December 04, 2013 4:00 PM
> > > To: Levantovsky, Vladimir
> > > Cc: public-webfonts-wg@w3.org
> > > Subject: Re: Reporting my findings on Action 123
> > > (http://www.w3.org/Fonts/WG/track/actions/open)
> > >
> > > > Thank you Vlad.  I forgot to double check the link.  My apologies.
> > > > Please use this link instead:
> > >
> > > Vlad's On-Curve Point Optimization Gains
> > >
> > > > https://docs.google.com/a/google.com/spreadsheet/ccc?key=0AvcH1ZzSrGMGdEFUMFlEUkFmQ0JCRmFTVGgyNEllRUE&usp=sharing#gid=0
> > >
> > > On Wed, Dec 4, 2013 at 11:57 AM, Levantovsky, Vladimir
> > > <Vladimir.Levantovsky@monotype.com> wrote:
> > > > > Thank you David, the online spreadsheet is a nice tool, I keep
> > > > > forgetting that we can share the data using ways that don't require
> > > > > an installed application suite ;-) When I tried to access the file
> > > > > though it said that the file doesn't exist (yet?) - can you please
> > > > > check into that?
> > > >
> > > > > Meanwhile, I've made a few changes to my toy project and extended
> > > > > the collected dataset to count the exact number of bytes saved if we
> > > > > eliminate the coordinates of predictable points. The slightly updated
> > > > > spreadsheet is attached; as you can see, each eliminated point
> > > > > consumes on average 3.12 bytes.
> > > >
> > > > Talk to you all soon,
> > > > Vlad
> > > >
> > > >
> > > >> -----Original Message-----
> > > >> From: David Kuettel [mailto:kuettel@google.com]
> > > >> Sent: Wednesday, December 04, 2013 2:26 PM
> > > >> To: Levantovsky, Vladimir
> > > >> Cc: public-webfonts-wg@w3.org
> > > >> Subject: Re: Reporting my findings on Action 123
> > > >> (http://www.w3.org/Fonts/WG/track/actions/open)
> > > >>
> > > > >> Fantastic, thank you Vlad!  Looking forward to discussing this in
> > > > >> the working group meeting today.  To aid in the discussion, I
> > > > >> created an online spreadsheet along with a chart of the optimization
> > > > >> gains.
> > > >>
> > > >> Vlad's On-Curve Point Optimization Gains
> > > > >> https://docs.google.com/spreadsheets/d/1PA9ssfAdWh2GKhhgStkw0-yiiNAeG1zdfZqRzAVWaXM/edit?usp=sharing
> > > >>
> > > > >> It would be fascinating to see the results of the experiment
> > > > >> across more font collections, esp. to see if any trends/patterns
> > > > >> emerged.
> > > >>
> > > >> On Tue, Dec 3, 2013 at 2:40 PM, Levantovsky, Vladimir
> > > >> <Vladimir.Levantovsky@monotype.com> wrote:
> > > >> > Folks,
> > > >> >
> > > >> >
> > > >> >
> > > >> > <Rant>
> > > >> >
> > > > >> > With the Thanksgiving holidays and all travel behind I came back
> > > > >> > to the office to a backlog of over 500 emails in my Inbox. Some
> > > > >> > folks clearly don't like holidays and prefer to work overtime - I
> > > > >> > figured that it may be a good day to forget about emails and just
> > > > >> > do something else instead, e.g. exploring on-curve point
> > > > >> > optimization. :-)
> > > >> >
> > > >> > </Rant>
> > > >> >
> > > >> >
> > > >> >
> > > > >> > Here are the preliminary results (attached) - so far I ran the
> > > > >> > test only on the fonts I have installed on my computer (without
> > > > >> > prejudice). The numbers reported are:
> > > >> >
> > > > >> > -          total number of all points for all contours defined in
> > > > >> > a 'glyf' table;
> > > > >> >
> > > > >> > -          number of on-curve points where their coordinates can
> > > > >> > be predicted *precisely* by using the coordinates of two adjacent
> > > > >> > off-curve points (and, therefore, the actual coordinates can be
> > > > >> > eliminated from the pre-processed output by simply using one
> > > > >> > reserved bit in 'flags' field to mark the point as
> > > > >> > "predictable"), and
> > > > >> >
> > > > >> > -          percentage of points that can be predicted, per font.
> > > >> >
> > > >> >
> > > >> >
> > > > >> > As you can see, while individual font results vary
> > > > >> > significantly, the average number of all points that can be
> > > > >> > predicted [with respective coordinates eliminated as redundant
> > > > >> > info] is about 1.42%. Considering that point coordinates may use
> > > > >> > either one- or two-byte formats, the actual file size saving is
> > > > >> > likely to be somewhat smaller; my guess is that it would yield
> > > > >> > savings of around 0.7-1% (this statement has not been evaluated by
> > > > >> > the FDA!)
> > > >> >
> > > >> >
> > > >> >
> > > >> > Let's discuss this over email and during the call tomorrow and
> > > >> > see if there is a desire to do more about it.
> > > >> >
> > > >> >
> > > >> >
> > > >> > Cheers,
> > > >> >
> > > >> > Vlad
> > > >> >
> > > >> >
>
Received on Wednesday, 11 December 2013 22:28:05 UTC