Re: [invisibleXML/ixml] Tweak existing Unicode tests, add a Unicode version test (PR #212)

ixampl returns an empty result.

This turned out to be due to a power failure that left ixampl in a dubious 
state: it appeared to be running but, because a file that should have been 
deleted was still present, it was failing to include the results in the 
output.


That notwithstanding, the test case also surfaced a bug in ixampl that ought 
to be a separate test case for Earley parsers:


 If a number of alternatives all start with the same nonterminal at the 
same input position, only one of the alternatives actually processes the 
nonterminal, and then signals the other alternatives to restart if the 
nonterminal was successful.
 Because Earley treats alternatives in input-position order, normally all 
alternatives with the same leading nonterminal will have started by the 
time the nonterminal succeeds.
 However, if the leading nonterminal succeeds without consuming any input 
(i.e. it has an empty alternative), it can be the case that (some of) the 
other alternatives with that leading nonterminal have not yet started, and 
so miss the signal to restart, and thus hang for ever.
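
To make the failure mode concrete, here is a toy model of the signalling 
described above. It is only a sketch, not ixampl's actual code: the Tracker 
class, its method names, and the "remember_success" switch are all invented 
for illustration. With the switch off, an alternative that starts after the 
nonterminal has already succeeded stays in the waiting list; with it on, 
late arrivals are restarted at once.

```python
from collections import defaultdict


class Tracker:
    """Toy model (hypothetical, not ixampl's code) of alternatives that
    share a leading nonterminal at the same input position."""

    def __init__(self, remember_success):
        self.remember_success = remember_success
        self.waiting = defaultdict(list)  # (nonterminal, pos) -> waiting alternatives
        self.succeeded = set()            # (nonterminal, pos) pairs that have succeeded
        self.restarted = []               # alternatives that received the signal

    def start(self, alt, nonterminal, pos):
        """An alternative reaches its leading nonterminal at position pos."""
        key = (nonterminal, pos)
        if self.remember_success and key in self.succeeded:
            # Late arrival: the success was recorded, so restart immediately.
            self.restarted.append(alt)
        else:
            self.waiting[key].append(alt)

    def succeed(self, nonterminal, pos):
        """The nonterminal succeeds; signal everyone waiting right now."""
        key = (nonterminal, pos)
        self.succeeded.add(key)
        self.restarted.extend(self.waiting.pop(key, []))


def run(remember_success):
    t = Tracker(remember_success)
    t.start("alt1", "A", 0)
    t.succeed("A", 0)        # A matches the empty string, so it succeeds at once
    t.start("alt2", "A", 0)  # alt2 only starts after the signal was sent
    return t.restarted, dict(t.waiting)


if __name__ == "__main__":
    print("without memory:", run(False))
    print("with memory:   ", run(True))
```

In the first run "alt2" is left waiting for a signal that never comes, which 
corresponds to the hang; in the second run both alternatives are restarted.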


I have to admit, I have long stared at that bit of code, and wondered if 
there was an edge case that would fail, and promised myself I would analyse 
it at some point. Well, this test case forced me to do it.


Anyway, after changing the ixml to work around this bug, ixampl returns 
15.0.


So, my suggestion would be to factor out the s's so that this test only 
tests Unicode, and add another test for the bug.


By the way, I edited my earlier version of the Unicode version test in this 
way, but hadn't published it yet:


Unicode: version.


@version: v15; v14; v13; v12; pre-v12.


-v15: Lo, Lo, Lo, Lo, +"15".
-v14: Cn, Lo, Lo, Lo, +"14".
-v13: Cn, Cn, Lo, Lo, +"13".
-v12: Cn, Cn, Cn, Lo, +"12".
-pre-v12: Cn, Cn, Cn, Cn, +"pre 12".


-Lo: -[Lo], -#a.
-Cn: -[Cn], -#a.


Thus producing the output:


<Unicode version='15'/>
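

For anyone who wants to cross-check the logic outside ixml, the pattern 
table in the grammar can be sketched in Python as below. The function names 
are made up for this example, and the probe characters themselves are 
deliberately not filled in. Note that Python's unicodedata module reports 
categories according to the Unicode version bundled with the interpreter, 
so version_from_text describes Python's Unicode version, not that of any 
ixml processor.

```python
import unicodedata

# Category patterns copied from the v15 .. pre-v12 rules of the grammar:
# each probe line is either an assigned letter (Lo) or unassigned (Cn).
PATTERNS = {
    ("Lo", "Lo", "Lo", "Lo"): "15",
    ("Cn", "Lo", "Lo", "Lo"): "14",
    ("Cn", "Cn", "Lo", "Lo"): "13",
    ("Cn", "Cn", "Cn", "Lo"): "12",
    ("Cn", "Cn", "Cn", "Cn"): "pre 12",
}


def version_from_categories(categories):
    """Return the version string for four categories, or None if no rule matches."""
    return PATTERNS.get(tuple(categories))


def version_from_text(text):
    """Classify input shaped like the grammar expects: one character per
    line, every line (including the last) terminated by a newline."""
    probes = text.split("\n")[:-1] if text.endswith("\n") else None
    if not probes or any(len(p) != 1 for p in probes):
        return None
    return version_from_categories(unicodedata.category(p) for p in probes)
```

A pattern that is not in the table, such as Lo, Cn, Lo, Lo, gives None here, 
much as the grammar would simply fail to match.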


Steven

Received on Wednesday, 22 November 2023 12:16:13 UTC