- From: Steven Pemberton <steven.pemberton@cwi.nl>
- Date: Mon, 22 Aug 2022 12:45:38 +0000
- To: "Norm Tovey-Walsh" <norm@saxonica.com>, ixml <public-ixml@w3.org>
Interesting. The rules were only ever meant as hints, and certainly not normative. I certainly use a different set, and we changed them at least once, I think in order to make them so that any rule only used itself or earlier rules. But if people are adopting them as given, then we should either point out that other rules might work better for them, or indeed rewrite them again. Steven On Sunday 21 August 2022 19:04:07 (+02:00), Norm Tovey-Walsh wrote: > Hello folks, > > A Sunday evening observation for my fellow implementors before I wander > off in search of a cocktail. (I found rye whiskey at Tesco, oh frabjous > joy!) > > In the context of studying a grammar provided by a friend, Bethan and I > were discussing some aspects of the BNF that results from the rewrite > rules suggested in the specification. She observes, correctly, that many > of them result in “tails” of epsilons (that’s my paraphrase of her > somewhat more precise analysis). > > Matching “a” against “S=f+.f='a'.”, for example, winds up matching “a” > followed by “f-star”, f-star matches “f-option”, and “f-option” matches > ε. > > She proposed a different set of rules for f*, f+, and f**sep: > > f* ⇒ f-star > -f-star = f+ | (). > > f+ ⇒ f-plus > -f-plus = f | (f, f-plus). > > f**sep ⇒ f-star-sep > -f-star-sep = f++sep | (). > > That appears to be an equally valid set of rewrites and, whatever other > effects it may have, it’s clear that it will avoid the “tails”. > > It does, and my Earley parser with these rewrite rules runs, informally, > about twice as fast. > > Any further analysis will be the work of another day… > > Be seeing you, > norm > > -- > Norm Tovey-Walsh > Saxonica >
Received on Monday, 22 August 2022 12:45:56 UTC