Re: Unscientific performance results

Norm Tovey-Walsh <norm@saxonica.com> writes:

> For your amusement, here are the results I got running Michael’s recent
> performance tests (the a-star tests) through my Earley and GLL
> parsers.

From the test names, I infer that you ran a test catalog that included
just the grammar tests.

> Cells marked “-” indicate that the test didn’t complete. The “% diff”
> column is an attempt to indicate how much faster the GLL parser is.
> Negative numbers indicate that Earley was faster by that percent.

What I notice here is that all of the cases where GLL is faster involve
grammars in the 'long comment' style, over a certain size.

Earley is almost always faster for whitespace-padded grammars and for
grammars in the 'short comment' style, which has mostly comments of
length four ({..}), with the occasional three (at the end of a line, so
one line of comments is 64 characters including the newline) or very
occasional five or longer.

Rearranging things by style (m vs s, lc vs sc vs ws);

> |------------+------------+-------------+-----------|
> | test       | Earley     | GLL         |    % diff |
> |------------+------------+-------------+-----------|
> | G.d3.m.lc  | 1.87s      | 0.59s       |    316.95 |
> | G.d4.m.lc  | 17m6.83s   | 1.28s       |  80221.09 |

> | G.b6.m.lc  | 0.91s      | 0.87s       |    104.60 |
> | G.b7.m.lc  | 4.20s      | 1.10s       |    381.82 |
> | G.b8.m.lc  | 19.20s     | 0.80s       |   2400.00 |
> | G.b9.m.lc  | 3m15.25s   | 1.00s       |  19525.00 |
> | G.b10.m.lc | 19m38.45s  | 1.38s       |  85394.93 |

So GLL is faster when there are multiple comments of length 124 or more,
Earley if they are 61 characters or shorter.

> | G.d3.s.lc  | 25.54s     | 0.58s       |   4403.45 |
> | G.d4.s.lc  | -          | 1.21s       |           |
> | G.b5.s.lc  | 2.01s      | 0.60s       |    334.83 |
> | G.b6.s.lc  | 7.35s      | 0.89s       |    825.84 |
> | G.b7.s.lc  | 53.02s     | 0.85s       |   6237.65 |
> | G.b8.s.lc  | 7m27.34s   | 0.76s       |  58860.53 |
> | G.b9.s.lc  | 1h1m59.03s | 0.93s       | 399895.70 |
> | G.b10.s.lc | -          | 1.34s       |           |

GLL is faster when there is a single long comment of length 309 or more,
Earley for single long comments of 149 or less.

> | G.d4.s.sc  | 15m57.37s  | 12m24.77s   |    128.55 |
> | G.b8.s.sc  | 21.49s     | 20.56s      |    104.52 |
> | G.b9.s.sc  | 2m29.46s   | 2m11.77s    |    113.42 |
> | G.b10.s.sc | 19m55.89s  | 14m20.23s   |    139.02 |

GLL is faster if there is long block of many short comments, with the
long block totaling 2550 bytes or more (including newlines not inside
the comments), Earley for many short comments with a total length of
1270 or lower.

> | G.b8.s.ws  | 15m10.14s  | 12m37.35s   |    120.17 |
> | G.b9.s.ws  | 2h20m8.87s | 1h48m20.07s |    129.37 |
> |------------+------------+-------------+-----------|

GLL is faster if the grammar is padded with a single large block
of whitespace of length 2550 or more; Earley is faster for shorter big
blocks of whitespace.

It looks to me as if there is something non-linear in the recognition of
long comments by Coffeepot's Earley parser.  If I understand correctly,
on deterministic grammars Earley should in principle be linear in cost
(to the size of the grammar plus the size of the input) -- it makes me
wonder whether there is something non-deterministic about the way the
ixml grammar defines comments.

I'd post the run times for Aparecium, embarrassing though they would be
by comparison, but after the run finally finished I discovered that my
test harness has developed a fault where the code for keeping track of
parse time ought to be, so all the parse times show up as 0.   I'll just
say I am confident that Aparecium has much worse problems with
non-linearity than Coffeepot.


-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com

Received on Wednesday, 13 July 2022 14:52:11 UTC