- From: John Lumley <john@saxonica.com>
- Date: Tue, 25 Oct 2022 15:50:50 +0100
- To: ixml <public-ixml@w3.org>
- Message-ID: <c3ea4cf2-1e4b-8a8e-f84b-80fd56ce624f@saxonica.com>
I had an action to report on the effects of the changed rewrite rules. I have some very rough preliminary findings. My test case is the iXML test suite (as of perhaps a month ago ;-) which has 234 test sets, 726 test cases, resulting in 693 parsing tests and 174 grammar tests. It turns out that my implementation as originally written, uses the following repetition rewrites: f+ ⇒ f-plus -f-plus: f, f*. which is in the original spec. and f* => f-star. -f-star: f, f-star|(). _which isn't._ The option rewrite and the separator repetition rewrites are as in the spec. On my Windows laptop with Chrome, running in a SaxonJS environment, the base figure with the rewrites as described above. are that the tests take a total of *5 mins 30 seconds*, and create *3.4E6 Earley items* in the parsing process. (Note this does not include Earley parsing of the iXML grammars, as my system has a hand-written parser for the iXML sources.) In this process, 24 of the test cases fail, exclusivley those (sample.grammar.*) of exceptional ambiguity and usually empty string inputs, i.e. not situations anticipated in practical use. When I switch on the alternative rewrite of*f+* to that proposed by Bethan Tovey-Walsh: f+ => f-plus -f-plus = f | (f, f-plus). then there is a marked difference in performance. The same tests now take *1 minute 1 second* and create a total of *2.7E6 Earley items* - the timing figures are pretty consistent.However, four additional test cases now fail - all involving ixml compiling itself (one line, no spaces, spaces variants) and whichresult in two ambiguous solutions being generated, whereas before there was a single. I'll have to look a little more closely at this and investigate further possible rewrites for the separator cases.... John -- *John Lumley* MA PhD CEng FIEE john@saxonica.com
Received on Tuesday, 25 October 2022 14:51:25 UTC