Re: Gedcom example

Steven Pemberton <steven.pemberton@cwi.nl> writes:

>> If the goal is to validate an input stream and determine whether it
>> conforms to the Gedcom grammar or not, then failing to detect such
>> errors is a flaw; if the goal is to parse Gedcom data and provide an
>> accurate XML representation of it, then it may be a matter of
>> indifference what the parser does with other input. If there is no
>> requirement to raise an error on input that is not syntactically correct
>> Gedcom data, then the failure to detect the 23/94 error is not a
>> flaw.

> This is similar to the grammars we use for dates, many of which will
> accept 99 December 9999 with indifference.

> It is a point I cover in my tutorial: whether you want to check and
> convert, or just convert is a decision you have to make.

Exactly so.

I don't know whether this is a widespread experience or whether I am
atypical in this, but I almost always find myself aiming, when starting
a grammar, to accept all and only the good inputs, and I do not always
find it easy to see ways to simplify a grammar by loosening it to accept
a superset of the good inputs, or by tightening it to accept a subset of
the good inputs.  (This may be one reason I have so much trouble
understanding the Elisp code for some major modes, or even understanding
the documentation on how to write major modes for Emacs.)

So I like to study cases like Steven's Gedcom example, where a judicious
deviation from the all-and-only rule can make for a clear and simple
grammar that can work in practical situations because its approximation
of the target language is *close enough*.
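
To make the date case concrete, here is a minimal sketch of the two choices Steven describes. It is only an illustration under stated assumptions: it uses a Python regular expression as a stand-in for a grammar, and the names LOOSE_DATE, convert, and check_and_convert are invented for this example, not taken from Steven's tutorial or from any Gedcom grammar.

```python
import calendar
import re

# Month names are spelled out so the sketch does not depend on the locale.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# The loose "grammar" (hypothetical): one or two digits, a month name,
# four digits. It accepts a superset of the real dates.
LOOSE_DATE = re.compile(r"(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{4})")


def convert(text):
    """Just convert: return (day, month, year), or None if the shape is wrong."""
    m = LOOSE_DATE.fullmatch(text)
    if m is None:
        return None
    return int(m.group(1)), MONTHS.index(m.group(2)) + 1, int(m.group(3))


def check_and_convert(text):
    """Check and convert: also reject days that the month does not have."""
    parsed = convert(text)
    if parsed is None:
        return None
    day, month, year = parsed
    if year < 1:
        # Year 0000 fits the loose pattern but is treated as invalid here.
        return None
    days_in_month = calendar.monthrange(year, month)[1]
    if not 1 <= day <= days_in_month:
        return None
    return parsed


print(convert("99 December 9999"))            # accepted with indifference
print(check_and_convert("99 December 9999"))  # rejected
print(check_and_convert("4 August 2022"))     # accepted
```

The loose pattern stays short because the calendar arithmetic lives outside it; whether that extra check is worth having is exactly the decision about whether bad input must raise an error.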

-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com

Received on Thursday, 4 August 2022 13:35:36 UTC