Eliot's issues: & and determinism

At 08:35 PM 23/09/96 -0900, W. Eliot Kimber wrote:

BTW, I agree with all of EK's remarks, except as indicated.

>>* Should XML forbid use of the '&' connector in content models
>>(11.2.4.1)?
>
>How much does AND complicate validation? I've seen some statements to the
>effect that it complicates it quite a bit, but I have no way to evaluate
>these claims.

You *can* turn an &-constraint into a regular expression, but the size
of the regular expression grows very fast (Steve said exponentially; I'm
not sure that's precisely true, but it's fast). Supporting this would
have added significantly to the size of the MGML parser. I'd say lose
it; we can live without it.
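
To give a feel for the growth, here is a minimal Python sketch. It is not
MGML code, and the name expand_and_group is made up for illustration. It
assumes the most naive translation: an &-group of n distinct required
elements is rewritten as an alternation of every possible ordering, which
gives n! branches. Smarter encodings may do better, so this only shows the
cost of this one strategy.

```python
from itertools import permutations


def expand_and_group(names):
    """Rewrite an '&' group such as (a & b & c) using only ',' and '|'.

    Naive strategy (an assumption of this sketch): list every ordering
    of the members as a separate branch of one big alternation.
    """
    return "|".join("(" + ",".join(order) + ")" for order in permutations(names))


if __name__ == "__main__":
    # Branch counts are n!: 1, 2, 6, 24, 120, 720 for n = 1..6.
    for n in range(1, 7):
        names = [chr(ord("a") + i) for i in range(n)]
        model = expand_and_group(names)
        branches = model.count("|") + 1
        print(n, branches, len(model))
```

With two members the result is just (a,b)|(b,a), but by six members the
rewritten model already has 720 branches.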

>>* Should XML allow nondeterministic content models (11.2.4.3)?
>
>Again, how much does this complicate validation? I'm not ambiguity expert,
>but could the problem be solved simply by stipulating that a token is
>always matched to the first place in the content model it can match,
>without lookahead? 

Yes, it substantially complicates the parser.  It turns out, as Peter Sharpe
explained to me, that you can do a clever trick while turning NFAs into DFAs
to spot these, but I don't think it's obvious to our target XML programmer.
Still, changing this would be a pretty egregious conflict with 8879.
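
For the curious, here is a rough Python sketch of one way to spot a
nondeterministic model. I am not claiming this is the trick Peter
described; it is the textbook position-automaton check (number every
element occurrence, compute which occurrences can come first and which can
follow which, then look for two occurrences of the same name competing at
the same point). The tuple encoding of content models and the names
analyze and conflicts are assumptions of the sketch, and it ignores '&'
groups entirely.

```python
def analyze(node, symbols, follow):
    """Return (nullable, first, last) for a content-model node.

    Nodes are tuples: ("sym", name), ("seq", a, b), ("alt", a, b),
    ("star", a), ("plus", a), ("opt", a).  Each "sym" gets a position
    number; follow[p] collects the positions that may come right after p.
    """
    kind = node[0]
    if kind == "sym":
        pos = len(symbols)
        symbols.append(node[1])
        follow[pos] = set()
        return False, {pos}, {pos}
    if kind in ("seq", "alt"):
        n1, f1, l1 = analyze(node[1], symbols, follow)
        n2, f2, l2 = analyze(node[2], symbols, follow)
        if kind == "alt":
            return n1 or n2, f1 | f2, l1 | l2
        for p in l1:  # whatever ends the left part may be followed by the right
            follow[p] |= f2
        first = (f1 | f2) if n1 else f1
        last = (l1 | l2) if n2 else l2
        return n1 and n2, first, last
    # Unary operators: star, plus, opt.
    n, f, l = analyze(node[1], symbols, follow)
    if kind in ("star", "plus"):
        for p in l:  # repetition loops back to the start of the group
            follow[p] |= f
    return n or kind in ("star", "opt"), f, l


def conflicts(model):
    """List (origin, name) pairs where one name could match two positions."""
    symbols, follow = [], {}
    _, first, _ = analyze(model, symbols, follow)
    found = []
    for origin, targets in [("start", first)] + sorted(follow.items()):
        seen = set()
        for p in sorted(targets):
            if symbols[p] in seen:
                found.append((origin, symbols[p]))
            seen.add(symbols[p])
    return found


def sym(name):
    return ("sym", name)


# ((a,b)|(a,c)): on seeing 'a' the parser cannot tell which branch it is in.
ambiguous = ("alt", ("seq", sym("a"), sym("b")), ("seq", sym("a"), sym("c")))
# (a,(b|c)): accepts the same documents, but never needs lookahead.
factored = ("seq", sym("a"), ("alt", sym("b"), sym("c")))

if __name__ == "__main__":
    print(conflicts(ambiguous))
    print(conflicts(factored))
```

The first model reports a conflict on 'a' at the start; the factored one
reports none. An empty list means each incoming element name picks out at
most one place in the model, which is roughly what the 8879 rule demands.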

Cheers, Tim Bray
tbray@textuality.com http://www.textuality.com/ +1-604-488-1167

Received on Tuesday, 24 September 1996 09:42:44 UTC