Re: The rewrite rules matter

A further thought - I think some of the speed gains in Norm’s Earley parser may be coming from the reduction in epsilon transitions (if I’ve understood him correctly). If so, the rewriting to 

f, f-star | ()

has the same problem as Norm’s original rewriting, in that it requires the parser to recognize the empty string as the final token of any starred sequence. Rewriting f* to 

f-plus | ()

avoids that.

So I’m guessing your rewrite will work well for Norm’s GLL parser, but may wipe out some of the speed gains he’s reported for the Earley parser. I’ll try to persuade him to experiment and report back!

BTW

Sent from my iPhone

> On 25 Aug 2022, at 22:06, Bethan Tovey-Walsh <accounts@bethan.wales> wrote:
> 
> Oh, okay. My bad. I guess I misremembered how the pipe works - I understood this as “f followed by either f-star or empty”, rather than  “either f followed by f-star, or empty”.  
> 
> Sent from my iPhone
> 
>>> On 25 Aug 2022, at 21:22, John Lumley <john@saxonica.com> wrote:
>>> 
>> Surely f-star will match an empty string, by its second alternative…
>> 
>> Sent from my iPad
>> 
>>>> On 25 Aug 2022, at 19:28, Bethan Tovey-Walsh <accounts@bethan.wales> wrote:
>>>> 
>>> 
>>>> 
>>>> f* ⇒ f-star
>>>> -f-star= f, f-star|().
>>> 
>>> Maybe I’m misunderstanding something, but that would seem to require that you match at least one f? So wouldn’t this give you a rewrite of f+, not f*?
>>> 
>>> BTW
>>> 
>>>>> On 25 Aug 2022, at 15:42, John Lumley <john@saxonica.com> wrote:
>>>>> 
>>>> 
>>>> On 21/08/2022 18:04, Norm Tovey-Walsh wrote:
>>>>> f* ⇒ f-star
>>>>> -f-star = f+ | ().
>>>> I think there's a cheaper possibility for rewriting f*, which is self-contained and avoids an f+ rewrite. I seem to have been using for some time, without perhaps realising it:
>>>> 
>>>> f* ⇒ f-star
>>>> -f-star= f, f-star|().
>>>> -- 
>>>> John Lumley MA PhD CEng FIEE
>>>> john@saxonica.com

Received on Friday, 26 August 2022 01:12:02 UTC