- From: Martin J. Duerst <duerst@w3.org>
- Date: Fri, 25 Aug 2000 12:49:36 +0900
- To: Dan Connolly <connolly@w3.org>, Yuichi Koike <koike@mmp.cl.nec.co.jp>
- Cc: xmlschema-dev@w3.org
At 00/08/24 11:44 -0500, Dan Connolly wrote: >Yuichi Koike wrote: > > Let me show the real example, which is a part of P3P schema file. > > Is there any way to write a `right` schema, which is > > compatible with the following example? > >In short, no, not exactly, but you can probably get >close enough. Well, for the example, actually yes. > > <element name='DISPUTES'> > > <complexType content='elementOnly'> > > <element ref='p3p:EXTENSION' minOccurs='0' maxOccurs='unbounded'/> > > <element ref='p3p:LONG-DESCRIPTION' minOccurs='0' maxOccurs='1'/> > > <element ref='p3p:IMG' minOccurs='0' maxOccurs='1'/> > > <element ref='p3p:REMEDIES' minOccurs='0' maxOccurs='1'/> > > <element ref='p3p:EXTENSION' minOccurs='0' maxOccurs='unbounded'/> > > </complexType> > > </element> The trick here is to split this up into various variants that are all deterministic. The current thing is: (extension*, long?, img?, remedies?, extension*) In a first step, change this to ( (extension+, long?, img?, remedies?, extension*) |(long?, img?, remedies?, extension*) ) In the next step, we change it to: ( (extension+,( (long, img?, remedies?, extension*) |(img?, remedies?, extension*) ) ) |(long, img?, remedies?, extension*) |(img?, remedies?, extension*) ) Then we go to: ( (extension+,( (long, img?, remedies?, extension*) |(img, remedies?, extension*) |(remedies?, extension*) ) ) |(long, img?, remedies?, extension*) |(img?, remedies?, extension*) ) And finally, we end up with: ( (extension+,( (long, img?, remedies?, extension*) |(img, remedies?, extension*) |(remedies, extension*) ) ) |(long, img?, remedies?, extension*) |(img, remedies?, extension*) |(remedies, extension*) ) which is clearly deterministic. It is possible to do similar things for many if not most practical cases, but there are cases where it doesn't work. Here is an example: ((a | b)*, a, (a | b)) [from some paper of A. Bruggemann-Klein, via Makoto Murata] What it is saying is: a sequence of 'a' and 'b', at least two of them, and the second-to-last has to be an 'a'. For a finite-size deterministic automaton, there is no way to know it's at the second-to-last. Regards, Martin.
Received on Friday, 25 August 2000 01:34:45 UTC