Re: UPA question

From: Henry S. Thompson <ht@cogsci.ed.ac.uk>
Date: Wed, 10 Sep 2003 10:16:22 +0100
To: Kohsuke Kawaguchi <Kohsuke.Kawaguchi@Sun.COM>
Cc: www-xml-schema-comments@w3.org, Paul Sandoz <Paul.Sandoz@Sun.COM>, Santiago Pericas-Geertsen <Santiago.Pericasgeertsen@Sun.COM>
Message-ID: <f5bhe3lm2ah.fsf@erasmus.inf.ed.ac.uk>

Kohsuke Kawaguchi <Kohsuke.Kawaguchi@Sun.COM> writes:

> But on the other hand, quoting appendix H:
>
>     A precise formulation of this constraint can also be offered in
>     terms of operations on finite-state automaton: transcribe the
>     content model into an automaton in the usual way using epsilon
>     transitions for optionality and unbounded maxOccurs, unfolding other
>     numeric occurrence ranges
>
> If you unfold the numeric occurrence range as suggested by the above
> paragraph, you get the following, which is clearly an UPA violation.
>
>     (s?,u,u?),(s?,u,u?)?

First, note that per Appendix H the labels should be:

      (s1?,u2,u2?),(s1?,u2,u2?)?

But Appendix H goes on to say, in effect, "Determinize this automaton
. . ., then [erase the position indices] and check it's still
deterministic.  If not, UPA violation."  The determinized FSA is a bit
complex, so I won't try to draw it here, but it should be clear that
since the only labels are u2 and s1, each element name carries exactly
one position index, so erasing the indices cannot make two distinct
labels collide: if it's deterministic _with_ the numbers, it will be
deterministic without them.
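
The check described above can be tried mechanically.  What follows is
only a minimal sketch of my reading of the Appendix H procedure, not a
conforming implementation: the tuple encoding of content models and the
names sym, seq, opt, build_nfa, determinize and violates_upa are all
made up for illustration, and wildcards, substitution groups and
unbounded maxOccurs are left out.  It compares the labelling given
above with a hypothetical labelling in which every unfolded occurrence
gets its own position.

```python
from itertools import count

# Hypothetical encoding of a content model (not taken from the spec):
#   ("sym", name, pos)  element particle labelled with name + position
#   ("seq", a, b, ...)  sequence
#   ("opt", a)          optional particle (minOccurs=0, maxOccurs=1)
def sym(name, pos): return ("sym", name, pos)
def seq(*parts): return ("seq",) + parts
def opt(part): return ("opt", part)

def build_nfa(model):
    """Return (start, edges); an edge is (src, label, dst), label None = epsilon."""
    ids, edges = count(), []
    def go(node):
        kind = node[0]
        if kind == "sym":
            a, b = next(ids), next(ids)
            edges.append((a, (node[1], node[2]), b))
            return a, b
        if kind == "seq":
            start, end = go(node[1])
            for child in node[2:]:
                s, e = go(child)
                edges.append((end, None, s))   # glue consecutive particles
                end = e
            return start, end
        if kind == "opt":
            s, e = go(node[1])
            edges.append((s, None, e))         # epsilon edge for optionality
            return s, e
        raise ValueError(kind)
    start, _accept = go(model)
    return start, edges

def eps_closure(states, edges):
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is None and dst not in closure:
                closure.add(dst)
                stack.append(dst)
    return frozenset(closure)

def determinize(start, edges):
    """Subset construction over (name, position) labels."""
    dfa, todo = {}, [eps_closure({start}, edges)]
    while todo:
        state = todo.pop()
        if state in dfa:
            continue
        moves = {}
        for src, label, dst in edges:
            if src in state and label is not None:
                moves.setdefault(label, set()).add(dst)
        dfa[state] = {lab: eps_closure(t, edges) for lab, t in moves.items()}
        todo.extend(dfa[state].values())
    return dfa

def violates_upa(model):
    """True if erasing positions leaves a state with two same-name edges."""
    start, edges = build_nfa(model)
    for transitions in determinize(start, edges).values():
        names = [name for (name, _pos) in transitions]
        if len(names) != len(set(names)):
            return True
    return False

# Labelling from this message: (s1?,u2,u2?),(s1?,u2,u2?)?
shared = seq(seq(opt(sym("s", 1)), sym("u", 2), opt(sym("u", 2))),
             opt(seq(opt(sym("s", 1)), sym("u", 2), opt(sym("u", 2)))))

# Assumed alternative: a fresh position for every unfolded occurrence.
distinct = seq(seq(opt(sym("s", 1)), sym("u", 2), opt(sym("u", 3))),
               opt(seq(opt(sym("s", 4)), sym("u", 5), opt(sym("u", 6)))))

print(violates_upa(shared))    # False
print(violates_upa(distinct))  # True
```

With the shared labels, determinization merges the competing u
transitions before the indices are erased, so no state ends up with two
u edges.  With per-occurrence labels, the state reached after the first
u has both a u3 and a u5 edge, which collide once the indices are
dropped.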

My XML Europe paper [1] [2] presents the Appendix H construction in much more
detail and, I hope, with much more clarity.

ht

[1] http://www.ltg.ed.ac.uk/~ht/XML_Europe_2003.html
[2] http://www.idealliance.org/papers/xmle03/slides/thompson/thompson1/Overview.html
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                      Half-time member of W3C Team
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/
 [mail really from me _always_ has this .sig -- mail without it is forged spam]
Received on Wednesday, 10 September 2003 05:16:24 UTC
