Re: C.10 Allow nondeterministic content models?
At 06:50 20/10/96 -0700, Joe English wrote:
>Tim Bray <firstname.lastname@example.org> wrote:
>> At 11:46 AM 18/10/96 -0700, Joe English wrote:
>> >In fact, the ambiguity rule can make things *easier* for implementors.)
>> Say what? The algorithms necessary to detect ambiguity are not typically
>> within the repertoire of the hypothetical CS bachelor's-degree type that
>> we'd like to be able to construct a validating parser.
>True: the ambiguity restriction makes it harder to validate DTDs.
If you don't have &-groups, detecting non-determinism (I think that's a much
less confusing term than ambiguity) is pretty trivial. In fact it you use
one obvious algorithm for checking instances, non-determinism checking can
be done in, I would guess, less than 10 lines of code. The obvious
algorithm I have in mind is a simplification of Algorithm 3.5 in the Dragon
Book (p140 in my copy), which is the algorithm for constructing a DFA from a
regexp. The simplification is that instead of the complicated final step
where you do the subset construction, you simply read off the DFA directly
from the followpos() sets. While you're doing that, all you have to do to
check non-determinism is to check each followpos set to see if it includes
two distinct positions labeled with the same symbol (ie generic identifier).