- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Sat, 23 Jul 2022 10:46:16 -0600
- To: M Joel Dubinko <micah@dubinko.info>
- Cc: public-ixml@w3.org
M Joel Dubinko <micah@dubinko.info> writes:

> Appreciate the comments, Michael.
>
> I should have mentioned, my focus at this early stage is building a
> strong foundation that I can incrementally push forward until it
> supports all of ixml. I’m mainly focused on getting core Earley
> parsing correct, and setting up fundamental data structures. (And
> grokking Rust!)

Understood. I have similar hopes for a future Earley parser in another
programming language.

> I’ll respond to some of the comments below. But first, a general
> question: I know about the ixml test suite [1], but is there anything
> comparable for testing a bare Earley parser? (For example, given
> grammar X and input Y, you should expect a trace Z as follows...)

Not in the ixml test suite. I have not looked at the web more
generally.

My understanding, for what it's worth, is that Coffeepot and jωiXML
both have ways of dumping their internal data structures (and
Aparecium will have a run-time option for doing so rsn, though
currently I make that happen by just adding an unparseable extra
character to the input in order to make the parse fail, so that
Aparecium dumps its Earley items). So you can at least get something.

Since I think both Coffeepot and jωiXML translate ixml into a more
restricted BNF syntax internally, and Aparecium does not, there is
likely to be some daylight among the results on a given test.

> (And apologies if I missed it, but is there a zip or other way to
> conveniently download the whole ixml suite? Not ready for it yet, but
> still aiming to have something semi-presentable by Balisage)

It's not a separate github repo, but if you download the zip for the
Invisible XML ixml repo at

  https://github.com/invisiblexml/ixml

the tests/ directory has what you need.

> ...
> Am I right to read this as indicating you plan to level the differences
> between
>
> ['a'; 'b'; '0'-'9']
>
> and
>
> ('a'; 'b'; '0'-'9')

> There’s more structure to character “sets” than I’ve currently
> sketched out. Both of these would have an identical effect on shaping
> the allowed grammar, correct?

Yes, unless I made a mistake constructing the example.

-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
Received on Saturday, 23 July 2022 16:59:17 UTC
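
The message above asks for a "grammar X, input Y, expected trace Z" style of test for a bare Earley parser, and mentions dumping Earley items as a way to get such a trace. As a rough illustration only, here is a minimal Earley recognizer in Python that builds its item sets and prints them. Everything in it is an assumption made for this sketch: the names `Item`, `earley_chart`, `recognizes`, `dump`, and `GRAMMAR`, the toy grammar, and the dump format. It does not reflect how Coffeepot, jωiXML, or Aparecium represent or print their items, and it deliberately does not handle rules with an empty right-hand side.

```python
from collections import namedtuple

# One Earley item: the rule "lhs -> rhs", a dot position inside rhs,
# and the input position at which recognition of this rule began.
Item = namedtuple("Item", "lhs rhs dot origin")


def earley_chart(grammar, start, text):
    """Return the Earley chart: one list of items per input position.

    grammar maps each nonterminal to a list of right-hand sides (tuples).
    Any symbol that is not a key of grammar is a one-character terminal.
    Assumption of this sketch: no rule has an empty right-hand side.
    """
    chart = [[] for _ in range(len(text) + 1)]

    def add(pos, item):
        if item not in chart[pos]:
            chart[pos].append(item)

    for rhs in grammar[start]:
        add(0, Item(start, rhs, 0, 0))

    for pos in range(len(text) + 1):
        i = 0
        while i < len(chart[pos]):  # the list may grow while we walk it
            item = chart[pos][i]
            i += 1
            if item.dot < len(item.rhs):
                sym = item.rhs[item.dot]
                if sym in grammar:
                    # Predictor: expect every rule for this nonterminal here.
                    for rhs in grammar[sym]:
                        add(pos, Item(sym, rhs, 0, pos))
                elif pos < len(text) and text[pos] == sym:
                    # Scanner: the terminal matches the next input character.
                    add(pos + 1, item._replace(dot=item.dot + 1))
            else:
                # Completer: advance every item that was waiting for item.lhs.
                for parent in list(chart[item.origin]):
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        add(pos, parent._replace(dot=parent.dot + 1))
    return chart


def recognizes(grammar, start, text):
    """True if a completed start rule spans the whole input."""
    final = earley_chart(grammar, start, text)[-1]
    return any(it.lhs == start and it.dot == len(it.rhs) and it.origin == 0
               for it in final)


def dump(chart):
    """Render the chart as text, one item per line, grouped by position."""
    lines = []
    for pos, items in enumerate(chart):
        lines.append(f"=== {pos} ===")
        for it in items:
            rhs = list(it.rhs)
            rhs.insert(it.dot, "•")
            lines.append(f"{it.lhs} -> {' '.join(rhs)}  ({it.origin})")
    return "\n".join(lines)


# Hypothetical toy grammar: sum = sum, '+', digit; digit.  digit = '1'; '2'.
GRAMMAR = {
    "sum": [("sum", "+", "digit"), ("digit",)],
    "digit": [("1",), ("2",)],
}

if __name__ == "__main__":
    print(dump(earley_chart(GRAMMAR, "sum", "1+2")))
```

A dump like this can be saved next to the grammar and input as an expected trace. As the message notes, implementations that first translate ixml into a more restricted BNF will produce different item sets from ones that do not, so such traces are best compared only within a single implementation.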