- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Thu, 04 Aug 2022 07:23:41 -0600
- To: Steven Pemberton <steven.pemberton@cwi.nl>
- Cc: Norm Tovey-Walsh <norm@saxonica.com>, public-ixml@w3.org
Steven Pemberton <steven.pemberton@cwi.nl> writes:

>> If the goal is to validate an input stream and determine whether it
>> conforms to the Gedcom grammar or not, then failing to detect such
>> errors is a flaw; if the goal is to parse Gedcom data and provide an
>> accurate XML representation of it, then it may be a matter of
>> indifference what the parser does with other input. If there is no
>> requirement to raise an error on input that is not syntactically correct
>> Gedcom data, then the failure to detect the 23/94 error is not a
>> flaw.

> This is similar to the grammars we use for dates, many of which will
> accept 99 December 9999 with indifference.

> It is a point I cover in my tutorial: whether you want to check and
> convert, or just convert is a decision you have to make.

Exactly so.

I don't know whether this is a widespread experience or whether I am
atypical in this, but I almost always find myself aiming, when starting
a grammar, to accept all and only the good inputs, and I do not always
find it easy to see ways to simplify a grammar by loosening it to accept
a superset of the good inputs, or by tightening it to accept a subset of
the good inputs.

(This may be one reason I have so much trouble understanding the Elisp
code for some major modes, or even understanding the documentation on
how to write major modes for Emacs.)

So I like to study cases like Steven's Gedcom example, where a judicious
deviation from the all-and-only rule can make for a clear and simple
grammar that can work in practical situations because its approximation
of the target language is *close enough*.

--
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
Received on Thursday, 4 August 2022 13:35:36 UTC