- From: Roberto Chinnici <Roberto.Chinnici@Sun.COM>
- Date: Thu, 21 Jul 2005 09:09:25 -0400
- To: "Rogers, Tony" <Tony.Rogers@ca.com>
- Cc: David Orchard <dorchard@bea.com>, www-ws-desc@w3.org
Yes, actually, after rereading the emails this morning I agree with you, and I find my application of the algorithm I described faulty.

Roberto

Rogers, Tony wrote:
> I don't think that follows. I do like the way you phrase the argument,
> but I come to a different result :-)
>
> The processor accepts the first name. At that point it is looking either
> for another name, or for a country, or for the end of the shipto. The
> next tag is a country. So it accepts the country, and makes a
> transition. Then it is looking for another country, or the end of the
> shipto. It does not return to a state in which it will accept another
> name, so it will ignore any more names.
>
> Additionally, I think it should treat the appearance of a name after a
> country as a transition to a new state - one in which it is simply
> looking for the end of the shipto. This is because a list of elements
> cannot be interrupted by another element (I wasn't thinking clearly last
> night :-) ). So I contend that the processor should accept exactly one
> name and one country. Note that the new schema might accept a list of
> alternating names and countries (that is describable, isn't it?), but
> our schema does not.
>
> We must keep in mind that the processor is being expected to handle data
> that conforms to a larger, but compatible, schema - we can expect that
> that larger schema still abides by the rules of schema.
>
> As I said, I do like the way you phrased this. I think we can use it as
> the basis of an algorithm that can describe what surgery is required on
> the incoming data to make it acceptable. And it's deterministic -
> that's good.
>
> Tony
>
> -----Original Message-----
> *From:* www-ws-desc-request@w3.org on behalf of Roberto Chinnici
> *Sent:* Thu 21-Jul-05 11:19
> *To:* Rogers, Tony
> *Cc:* David Orchard; www-ws-desc@w3.org
> *Subject:* Re: LC124: Comment on V2S and [validity]=notKnown
>
> Rogers, Tony wrote:
> > One of the "interesting" aspects of the problem that we must solve is
> > how we decide on the interpretation of ambiguous results.
> >
> > For example, it will be legal to take your example:
> >
> > <type name="shipto">
> >   <sequence>
> >     <element ref="ad:name" minOccurs="1" maxOccurs="unbounded"/>
> >     <element ref="nad:country" minOccurs="0"/>
> >   </sequence>
> > </type>
> >
> > (yes, I meant to change that to minOccurs)
> >
> > and feed it data like:
> >
> > <shipto>
> >   <ad:name>fred</ad:name>
> >   <nad:country>Australia</nad:country>
> >   <ad:name>bill</ad:name>
> > </shipto>
> >
> > which can legitimately be interpreted (after ignorance has been
> > applied) as:
> >
> > <shipto>
> >   <ad:name>fred</ad:name>
> >   <ad:name>bill</ad:name>
> > </shipto>
> >
> > OR
> >
> > <shipto>
> >   <ad:name>fred</ad:name>
> >   <nad:country>Australia</nad:country>
> > </shipto>
> >
> > The latter is my expected interpretation (and may well be the easier to
> > program), but the former is legitimate (it takes the approach of
> > grabbing as many ad:name elements as it can, and it still satisfies the
> > schema).
> >
> > What do other people think?
>
> I tend to go with the first interpretation.
>
> Here's how I'd define the "ignore unexpected" rule. This definition is
> not phrased directly in terms of XML Schema, and I don't claim that it
> would be trivial to do so, quite the contrary. Nevertheless, it seems
> compatible with it; if anybody thinks otherwise, please point out where
> I'm wrong.
>
> That scourge of all schema authors, the UPA rule, was introduced to
> make sure the schema was deterministic. I assume then that at any
> given stage during the parsing of the contents of an element, the
> set of start tags that can legally be encountered is determined
> and each tag in that set is associated with exactly one transition
> to a new state. (I believe we can safely ignore character content
> for the purposes of our discussion.)
>
> Note that the set above, or better the set of names of all start tags
> that can be encountered at any given state, may be infinite due to
> the presence of a wildcard. This doesn't cause any problems -- all
> we need is that the characteristic function of this set be computable.
> Off the top of my head, I don't think that substitution groups would
> be an issue either, they just make the construction of the set more
> complex, nor would xsi:schemaLocation.
>
> Now, the "ignore unexpected" rule is defined as saying that if at
> a given state the processor encounters a start tag for an element
> whose name is not in the set of expected start tags for that state,
> the element is discarded. Subsequently, the processor keeps
> operating in the same state it was in (where would it transition
> to otherwise?), as if the discarded element had never been there.
>
> Surely there are a few more tweaks that we need to do, like requiring
> some special treatment for the root element of a document and
> dealing with attributes, but I hope that the definition I proposed
> is clear enough.
>
> If we apply it to the example then, we obtain that
>
> <shipto>
>   <ad:name>fred</ad:name>
>   <nad:country>Australia</nad:country>
>   <ad:name>bill</ad:name>
> </shipto>
>
> will be treated as
>
> <shipto>
>   <ad:name>fred</ad:name>
>   <ad:name>bill</ad:name>
> </shipto>
>
> Let's look at a slightly more interesting example.
> Assume the following schema: (note the maxOccurs="2")
>
> <type name="shipto">
>   <sequence>
>     <element ref="ad:name" minOccurs="1" maxOccurs="2"/>
>     <element ref="nad:country" minOccurs="0"/>
>   </sequence>
> </type>
>
> Then this document:
>
> <shipto>
>   <ad:name>fred</ad:name>
>   <nad:country>Australia</nad:country>
>   <ad:name>bill</ad:name>
>   <nad:country>New Zealand</nad:country>
>   <ad:name>jim</ad:name>
> </shipto>
>
> will be treated as:
>
> <shipto>
>   <ad:name>fred</ad:name>
>   <ad:name>bill</ad:name>
> </shipto>
>
> Thanks,
> Roberto
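
For readers who want to trace the "ignore unexpected" rule by hand, here is a minimal Python sketch of it, not a schema processor. It assumes a content model that is a flat sequence of element particles with minOccurs/maxOccurs and nothing else (no wildcards, substitution groups, attributes or nesting). The names Particle, next_particle and ignore_unexpected are hypothetical and invented for this illustration. A state is the pair (particle index, occurrences accepted so far); a child whose tag is not expected in the current state is dropped and the state is left unchanged. Since nad:country keeps the XML Schema default of maxOccurs="1", the sketch expects only the end of the shipto once a country has been accepted.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Particle:
    """One element particle of a flat <sequence> (hypothetical helper type)."""
    name: str
    min_occurs: int = 1      # XML Schema default
    max_occurs: float = 1    # XML Schema default; math.inf stands for "unbounded"


def next_particle(particles, index, count, tag):
    """Return the index of the particle that accepts `tag` in state
    (index, count), or None if `tag` is not expected in that state."""
    for j in range(index, len(particles)):
        seen = count if j == index else 0
        particle = particles[j]
        if particle.name == tag and seen < particle.max_occurs:
            return j
        if seen < particle.min_occurs:
            return None  # a required particle is still pending; cannot skip it
    return None


def ignore_unexpected(particles, children):
    """Keep the (tag, text) children that the sequence accepts, discarding
    unexpected ones without changing state."""
    kept = []
    index, count = 0, 0
    for tag, text in children:
        target = next_particle(particles, index, count, tag)
        if target is None:
            continue  # unexpected start tag: discard, stay in the same state
        count = count + 1 if target == index else 1
        index = target
        kept.append((tag, text))
    return kept


# The two schemas from the thread.
shipto_unbounded = [Particle("ad:name", 1, math.inf), Particle("nad:country", 0, 1)]
shipto_max_two = [Particle("ad:name", 1, 2), Particle("nad:country", 0, 1)]

first_doc = [("ad:name", "fred"), ("nad:country", "Australia"), ("ad:name", "bill")]
second_doc = first_doc + [("nad:country", "New Zealand"), ("ad:name", "jim")]

print(ignore_unexpected(shipto_unbounded, first_doc))
print(ignore_unexpected(shipto_max_two, second_doc))
# both print [('ad:name', 'fred'), ('nad:country', 'Australia')]
```

Under these assumptions both documents reduce to one name followed by one country, which is the result Tony argues for and the one Roberto agrees with in his reply at the top. The sketch only filters children; it does not check at the end that every minOccurs has been met.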
Received on Thursday, 21 July 2005 13:10:08 UTC