W3C home > Mailing lists > Public > xmlschema-dev@w3.org > June 2002

Re: Ambiguous content models -- allowed or disallowed by XSDL?

From: Jeni Tennison <jeni@jenitennison.com>
Date: Thu, 13 Jun 2002 12:07:52 +0100
Message-ID: <752146107299.20020613120752@jenitennison.com>
To: "Lehmann, Steen" <slehmann@silverstream.com>
CC: xmlschema-dev@w3.org

Hi Steen,

Thanks for the explanation. I'm afraid I'm still a bit confused...

> The NFA created from the innards of the <sequence> above would have
> five states with transitions between them, each transition would be
> labelled a, b or (empty), meaning that when the parser is in a state
> it will change to another state if there is a transition between the
> two states matching the current symbol it is trying to parse (or if
> there is a transition for (empty) between the two states). It may
> look like this, numbering the states and labelling empty transitions
> E. Computation starts in state 0 and is successful if it ends in
> state 4.
> 
> 0 --a--> 1
> 1 --a--> 2
> 1 --b--> 3
> 1 --E--> 3
> 2 --b--> 3
> 2 --E--> 4
> 3 --b--> 3
> 3 --E--> 4

That looks as though it would allow a, b, b, which isn't legal
according to the schema (b was optional, with a maxOccurs of 1).
Perhaps you meant it to be:

0 --a--> 1
1 --a--> 2
1 --b--> 3
1 --E--> 3
2 --b--> 3
2 --E--> 3

or (more probably) I'm missing something about how the state
transition works.

> Now, to turn this NFA into a corresponding DFA we "collapse" the
> empty transitions from each state and copy transitions from states
> we can reach by following empty transitions back to the original
> state (known as evaluating lambda closure). The resulting DFA has no
> empty transitions, so at each state we know immediately if a parsed
> symbol is legal (there is a transition somewhere for that symbol)
> without a need for backtracking.
>
> Here it is:
> 0 --a--> 1
> 1 --a--> 3
> 1 --b--> 2
> 2 --a--> 4
> 2 --b--> 2
> 3 --a--> 4
> 3 --b--> 2
> 4 --a--> 6
> 4 --b--> 5
> 5 --b--> 5
> 6 --b--> 5
>
> The parsed input is correct if evaluation terminates in state 5 or
> 6.

So in the case of the sequence a, a, b, you follow the path:

  0 --a--> 1 --a--> 3 --b--> 2

and since you don't get to state 5 or 6 it's invalid?

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/
Received on Thursday, 13 June 2002 07:07:54 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 11 January 2011 00:14:31 GMT