W3C home > Mailing lists > Public > xmlschema-dev@w3.org > June 2002

RE: Ambiguous content models -- allowed or disallowed by XSDL?

From: Lehmann, Steen <slehmann@silverstream.com>
Date: Thu, 13 Jun 2002 13:46:49 +0200
Message-ID: <FD8A740186525243807FBCDB7E739D0585F3A5@EXBE1.silverstream.be>
To: "'Jeni Tennison'" <jeni@jenitennison.com>
Cc: xmlschema-dev@w3.org

Hi Jeni,

There was indeed an error in my NFA, for some reason I defaulted the
maxOccurs on b to 'unbounded' instead of 1. Removing the transition from 3
to 3 corrects that, although there are many ways to specify an equivalent
graph using fewer transitions, as you've found out. Typically when
generating these graphs mechanically you tend to introduce more empty
transitions than strictly necessary because it makes the code simpler (I use
a recursive algorithm).

Since this process is very error-prone when done by hand I should have
verified the example I was trying to demonstrate. The DFA is actually a
little more complex:

0 --a--> 1
1 --a--> 2
1 --b--> 3
2 --b--> 4
2 --a--> 5
3 --a--> 6
4 --a--> 6
5 --b--> 7
5 --a--> 8
6 --b--> 7
8 --b--> 7

with accepting states 4, 5, 6, 7 and 8.

Cheers,

-- Steen

> -----Original Message-----
> From: Jeni Tennison [mailto:jeni@jenitennison.com]
> Sent: Thursday, June 13, 2002 1:08 PM
> To: Lehmann, Steen
> Cc: xmlschema-dev@w3.org
> Subject: Re: Ambiguous content models -- allowed or 
> disallowed by XSDL?
> 
> 
> Hi Steen,
> 
> Thanks for the explanation. I'm afraid I'm still a bit confused...
> 
> > The NFA created from the innards of the <sequence> above would have
> > five states with transitions between them, each transition would be
> > labelled a, b or (empty), meaning that when the parser is in a state
> > it will change to another state if there is a transition between the
> > two states matching the current symbol it is trying to parse (or if
> > there is a transition for (empty) between the two states). It may
> > look like this, numbering the states and labelling empty transitions
> > E. Computation starts in state 0 and is successful if it ends in
> > state 4.
> > 
> > 0 --a--> 1
> > 1 --a--> 2
> > 1 --b--> 3
> > 1 --E--> 3
> > 2 --b--> 3
> > 2 --E--> 4
> > 3 --b--> 3
> > 3 --E--> 4
> 
> That looks as though it would allow a, b, b, which isn't legal
> according to the schema (b was optional, with a maxOccurs of 1).
> Perhaps you meant it to be:
> 
> 0 --a--> 1
> 1 --a--> 2
> 1 --b--> 3
> 1 --E--> 3
> 2 --b--> 3
> 2 --E--> 3
> 
> or (more probably) I'm missing something about how the state
> transition works.
> 
> > Now, to turn this NFA into a corresponding DFA we "collapse" the
> > empty transitions from each state and copy transitions from states
> > we can reach by following empty transitions back to the original
> > state (known as evaluating lambda closure). The resulting DFA has no
> > empty transitions, so at each state we know immediately if a parsed
> > symbol is legal (there is a transition somewhere for that symbol)
> > without a need for backtracking.
> >
> > Here it is:
> > 0 --a--> 1
> > 1 --a--> 3
> > 1 --b--> 2
> > 2 --a--> 4
> > 2 --b--> 2
> > 3 --a--> 4
> > 3 --b--> 2
> > 4 --a--> 6
> > 4 --b--> 5
> > 5 --b--> 5
> > 6 --b--> 5
> >
> > The parsed input is correct if evaluation terminates in state 5 or
> > 6.
> 
> So in the case of the sequence a, a, b, you follow the path:
> 
>   0 --a--> 1 --a--> 3 --b--> 2
> 
> and since you don't get to state 5 or 6 it's invalid?
> 
> Cheers,
> 
> Jeni
> 
> ---
> Jeni Tennison
> http://www.jenitennison.com/
> 
Received on Thursday, 13 June 2002 07:47:27 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 11 January 2011 00:14:31 GMT