- From: Martin J. Duerst <duerst@w3.org>
- Date: Fri, 25 Aug 2000 12:49:36 +0900
- To: Dan Connolly <connolly@w3.org>, Yuichi Koike <koike@mmp.cl.nec.co.jp>
- Cc: xmlschema-dev@w3.org
At 00/08/24 11:44 -0500, Dan Connolly wrote:
>Yuichi Koike wrote:
> > Let me show the real example, which is a part of P3P schema file.
> > Is there any way to write a `right` schema, which is
> > compatible with the following example?
>
>In short, no, not exactly, but you can probably get
>close enough.
Well, for the example, actually yes.
> > <element name='DISPUTES'>
> > <complexType content='elementOnly'>
> > <element ref='p3p:EXTENSION' minOccurs='0' maxOccurs='unbounded'/>
> > <element ref='p3p:LONG-DESCRIPTION' minOccurs='0' maxOccurs='1'/>
> > <element ref='p3p:IMG' minOccurs='0' maxOccurs='1'/>
> > <element ref='p3p:REMEDIES' minOccurs='0' maxOccurs='1'/>
> > <element ref='p3p:EXTENSION' minOccurs='0' maxOccurs='unbounded'/>
> > </complexType>
> > </element>
The trick here is to split this up into various variants that are
all deterministic. The current thing is:
(extension*, long?, img?, remedies?, extension*)
In a first step, change this to
( (extension+, long?, img?, remedies?, extension*)
|(long?, img?, remedies?, extension*)
)
In the next step, we change it to:
( (extension+,( (long, img?, remedies?, extension*)
|(img?, remedies?, extension*)
)
)
|(long, img?, remedies?, extension*)
|(img?, remedies?, extension*)
)
Then we go to:
( (extension+,( (long, img?, remedies?, extension*)
|(img, remedies?, extension*)
|(remedies?, extension*)
)
)
|(long, img?, remedies?, extension*)
|(img?, remedies?, extension*)
)
And finally, we end up with:
( (extension+,( (long, img?, remedies?, extension*)
|(img, remedies?, extension*)
|(remedies, extension*)
)
)
|(long, img?, remedies?, extension*)
|(img, remedies?, extension*)
|(remedies, extension*)
)
which is clearly deterministic. It is possible to do similar
things for many if not most practical cases, but there are
cases where it doesn't work. Here is an example:
((a | b)*, a, (a | b))
[from some paper of A. Bruggemann-Klein, via Makoto Murata]
What it is saying is: a sequence of 'a' and 'b', at least two of
them, and the second-to-last has to be an 'a'. For a finite-size
deterministic automaton, there is no way to know it's at the
second-to-last.
Regards, Martin.
Received on Friday, 25 August 2000 01:34:45 UTC