Re: UPA question

Kohsuke Kawaguchi <Kohsuke.Kawaguchi@Sun.COM> writes:

> But on the other hand, quoting appendix H:
>
>     A precise formulation of this constraint can also be offered in
>     terms of operations on finite-state automaton: transcribe the
>     content model into an automaton in the usual way using epsilon
>     transitions for optionality and unbounded maxOccurs, unfolding other
>     numeric occurrence ranges
>
> If you unfold the numeric occurrence range as suggested by the above
> paragraph, you get the following, which is clearly an UPA violation.
>
>     (s?,u,u?),(s?,u,u?)?

First, note that the labels should be, per Appendix H:

      (s1?,u2,u2?),(s1?,u2,u2?)?

But Appendix H goes on to say, in effect: determinize this automaton,
then erase the position indices and check that it is still
deterministic.  If it is not, that is a UPA violation.  The determinized
FSA is a bit complex, so I won't try to draw it here, but it should be
clear that since the only labels are u2 and s1, each element name
carries exactly one position index, so erasing the indices cannot make
two distinct labels collide: if it's deterministic _with_ the numbers,
it will be deterministic without them.
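
For anyone who would rather run the check than draw the FSA, here is a
minimal Python sketch of the procedure as I read it.  It is not
normative: the helper names (NFA, sym, seq, opt, unfolded_model,
determinize, violates_upa) are made up for illustration, wildcards and
substitution groups are ignored, and the model is hard-wired to the
unfolded form above.  It builds an epsilon-NFA whose edge labels are
(name, position) pairs, determinizes it by subset construction, then
erases the positions and looks for a state with two transitions on the
same name.

```python
from itertools import count

class NFA:
    """Epsilon-NFA whose edge labels are (element name, position) pairs."""
    def __init__(self):
        self.eps = {}      # state -> set of epsilon successors
        self.edges = {}    # state -> list of (label, target)
        self._ids = count()

    def new_state(self):
        s = next(self._ids)
        self.eps[s] = set()
        self.edges[s] = []
        return s

def sym(nfa, name, pos):
    """Fragment accepting a single element labelled (name, pos)."""
    a, b = nfa.new_state(), nfa.new_state()
    nfa.edges[a].append(((name, pos), b))
    return a, b

def seq(nfa, *frags):
    """Concatenate fragments with epsilon transitions."""
    for (_, end), (start, _) in zip(frags, frags[1:]):
        nfa.eps[end].add(start)
    return frags[0][0], frags[-1][1]

def opt(nfa, frag):
    """Make a fragment optional via an epsilon bypass."""
    start, end = frag
    a, b = nfa.new_state(), nfa.new_state()
    nfa.eps[a] |= {start, b}
    nfa.eps[end].add(b)
    return a, b

def unfolded_model(positions):
    """Build (s?,u,u?),(s?,u,u?)? using the six given position indices."""
    nfa, p = NFA(), iter(positions)
    def group():
        return seq(nfa,
                   opt(nfa, sym(nfa, "s", next(p))),
                   sym(nfa, "u", next(p)),
                   opt(nfa, sym(nfa, "u", next(p))))
    first = group()
    second = opt(nfa, group())
    start, _ = seq(nfa, first, second)
    return nfa, start

def eps_closure(nfa, states):
    stack, seen = list(states), set(states)
    while stack:
        for t in nfa.eps[stack.pop()]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)

def determinize(nfa, start):
    """Subset construction; distinct (name, pos) labels stay distinct."""
    dfa, todo = {}, [eps_closure(nfa, {start})]
    while todo:
        cur = todo.pop()
        if cur in dfa:
            continue
        moves = {}
        for s in cur:
            for label, target in nfa.edges[s]:
                moves.setdefault(label, set()).add(target)
        dfa[cur] = {lab: eps_closure(nfa, ts) for lab, ts in moves.items()}
        todo.extend(dfa[cur].values())
    return dfa

def violates_upa(dfa):
    """Erase positions; report any state with two same-name transitions."""
    for transitions in dfa.values():
        names = [name for (name, _pos) in transitions]
        if len(names) != len(set(names)):
            return True
    return False

# Labels per Appendix H: both unfolded copies keep s1 and u2.
shared = determinize(*unfolded_model([1, 2, 2, 1, 2, 2]))
# Labels as if the unfolded model had been written out by hand.
distinct = determinize(*unfolded_model([1, 2, 3, 4, 5, 6]))
print(violates_upa(shared), violates_upa(distinct))
```

The second labelling is only there for contrast: it numbers every
particle of the unfolded expression separately, which is how the
expression would be labelled if it were the content model as written,
and under that labelling the check does report a violation.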

My XML Europe paper [1] [2] presents the Appendix H construction in much more
detail and, I hope, with much more clarity.

ht

[1] http://www.ltg.ed.ac.uk/~ht/XML_Europe_2003.html
[2] http://www.idealliance.org/papers/xmle03/slides/thompson/thompson1/Overview.html
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                      Half-time member of W3C Team
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/
 [mail really from me _always_ has this .sig -- mail without it is forged spam]

Received on Wednesday, 10 September 2003 05:16:24 UTC