From: Jeni Tennison <jeni@jenitennison.com>

Date: Thu, 13 Jun 2002 12:07:52 +0100

Message-ID: <752146107299.20020613120752@jenitennison.com>

To: "Lehmann, Steen" <slehmann@silverstream.com>

CC: xmlschema-dev@w3.org

Date: Thu, 13 Jun 2002 12:07:52 +0100

Message-ID: <752146107299.20020613120752@jenitennison.com>

To: "Lehmann, Steen" <slehmann@silverstream.com>

CC: xmlschema-dev@w3.org

Hi Steen, Thanks for the explanation. I'm afraid I'm still a bit confused... > The NFA created from the innards of the <sequence> above would have > five states with transitions between them, each transition would be > labelled a, b or (empty), meaning that when the parser is in a state > it will change to another state if there is a transition between the > two states matching the current symbol it is trying to parse (or if > there is a transition for (empty) between the two states). It may > look like this, numbering the states and labelling empty transitions > E. Computation starts in state 0 and is successful if it ends in > state 4. > > 0 --a--> 1 > 1 --a--> 2 > 1 --b--> 3 > 1 --E--> 3 > 2 --b--> 3 > 2 --E--> 4 > 3 --b--> 3 > 3 --E--> 4 That looks as though it would allow a, b, b, which isn't legal according to the schema (b was optional, with a maxOccurs of 1). Perhaps you meant it to be: 0 --a--> 1 1 --a--> 2 1 --b--> 3 1 --E--> 3 2 --b--> 3 2 --E--> 3 or (more probably) I'm missing something about how the state transition works. > Now, to turn this NFA into a corresponding DFA we "collapse" the > empty transitions from each state and copy transitions from states > we can reach by following empty transitions back to the original > state (known as evaluating lambda closure). The resulting DFA has no > empty transitions, so at each state we know immediately if a parsed > symbol is legal (there is a transition somewhere for that symbol) > without a need for backtracking. > > Here it is: > 0 --a--> 1 > 1 --a--> 3 > 1 --b--> 2 > 2 --a--> 4 > 2 --b--> 2 > 3 --a--> 4 > 3 --b--> 2 > 4 --a--> 6 > 4 --b--> 5 > 5 --b--> 5 > 6 --b--> 5 > > The parsed input is correct if evaluation terminates in state 5 or > 6. So in the case of the sequence a, a, b, you follow the path: 0 --a--> 1 --a--> 3 --b--> 2 and since you don't get to state 5 or 6 it's invalid? Cheers, Jeni --- Jeni Tennison http://www.jenitennison.com/Received on Thursday, 13 June 2002 07:07:54 UTC

*
This archive was generated by hypermail 2.3.1
: Wednesday, 7 January 2015 14:55:57 UTC
*