Re: The rewrite rules matter

On 21/08/2022 18:04, Norm Tovey-Walsh wrote:
> f+ ⇒ f-plus
> -f-plus = f | (f, f-plus).
>
> f**sep ⇒ f-star-sep
> -f-star-sep = f++sep | ().
>
> That appears to be an equally valid set of rewrites and, whatever other
> effects it may have, it’s clear that it will avoid the “tails”.
>
> It does, and my Earley parser with these rewrite rules runs, informally,
> about twice as fast.

I've implemented the f+ and f++sep rewrites and, judging by a run of the
ixml test suite, I seem to get almost a 5x speed-up with the Earley
parser (1 minute vs 5 minutes, and pretty consistently). I'll work on the
f* rewrites next and see what additional effect they have. I seem to have
introduced some minor ambiguity (it appears in three of the ixml(ixml)
tests), but the resulting XML trees are identical.
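
For anyone who wants to convince themselves that the quoted rewrites generate the expected strings, here is a rough sketch in Python. It is not taken from either parser: the `language` helper, the toy terminals ("a" for f, "," for sep) and the `f-plus-sep` rule are all assumptions made for illustration. In particular, `f-plus-sep` is only a guess at an f++sep rewrite by analogy with the quoted f+ rule, not necessarily the one actually implemented.

```python
from itertools import product


def language(grammar, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start.

    grammar maps a nonterminal name to a list of alternatives; each
    alternative is a tuple of symbols. Any symbol that is not a key of
    grammar is treated as a literal terminal string.
    """
    langs = {nt: set() for nt in grammar}
    changed = True
    # Iterate to a fixed point; it terminates because only finitely many
    # strings of bounded length exist over a finite alphabet.
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                # Snapshot what each symbol can produce so far.
                choices = [tuple(langs[s]) if s in grammar else (s,) for s in alt]
                for parts in product(*choices):
                    text = "".join(parts)
                    if len(text) <= max_len and text not in langs[nt]:
                        langs[nt].add(text)
                        changed = True
    return langs[start]


# Quoted rewrite: -f-plus = f | (f, f-plus).
F_PLUS = {
    "f-plus": [("f",), ("f", "f-plus")],
    "f": [("a",)],  # toy stand-in for f
}

# Quoted rewrite: -f-star-sep = f++sep | ().
F_STAR_SEP = {
    "f-star-sep": [("f-plus-sep",), ()],
    # Assumed f++sep rewrite, by analogy with f-plus (not from the thread).
    "f-plus-sep": [("f",), ("f", "sep", "f-plus-sep")],
    "f": [("a",)],
    "sep": [(",",)],  # toy stand-in for sep
}

plus_strings = language(F_PLUS, "f-plus", 4)
star_sep_strings = language(F_STAR_SEP, "f-star-sep", 5)

print(sorted(plus_strings))
print(sorted(star_sep_strings))
```

With these toy terminals, the first set holds one to four "a"s and the second holds the empty string plus "a", "a,a" and "a,a,a", which is what f+ and f**sep should accept within those length bounds. This only checks the generated language; it says nothing about speed or about where the ambiguity comes from.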

-- 
*John Lumley* MA PhD CEng FIEE
john@saxonica.com

Received on Wednesday, 24 August 2022 15:23:20 UTC