RE: About those "non-deterministic content model" errors

From: Fuchs, Matthew <matthew.fuchs@commerceone.com>
Date: Thu, 11 Jan 2001 11:28:12 -0800
Message-ID: <4C4A7BE77CE1D311A1D200508BA38C1202F3538D@venus.commerceone.com>
To: "'Slein, Judith A'" <JSlein@crt.xerox.com>, "'www-xml-schema-comments@w3.org'" <www-xml-schema-comments@w3.org>
Cc: "Sembower, Neil R" <NSembower@crt.xerox.com>
Judith,

This is the problem with trying to hack a solution when a perfectly good
mechanism is being ignored :-(

If you take CD out of the jdf:Resource substitution group then there is no
problem.  In your application, you match on element _type_ not element
_name_.  The alternatives are less good: you locally munge the JDF schema to
remove the <any> (which would then validate only your extensions, but no one
else's), or you wait to see how we decide one of our CR issues, which could
allow you to better control which namespaces <any> matches (but then you
locally munge the <any>).  I don't like local munging.
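
A minimal sketch of the first option, assuming the myns prefix and the CD
type from Judith's example (the content of the extension is left out, as in
her message).  CD is still derived from jdf:Resource, but the element
declaration no longer names a substitution group, so inside a ResourcePool
only the ##other wildcard can match <myns:CD>:

```xml
<!-- Fragment of the extension schema; myns is the assumed extension prefix. -->
<!-- No substitutionGroup here, so <myns:CD> matches only
     <any namespace="##other" processContents="lax"/> in jdf:ResourcePool. -->
<element name="CD" type="myns:CD"/>
<complexType name="CD">
   <complexContent>
      <extension base="jdf:Resource">
         <!-- extension content as before -->
      </extension>
   </complexContent>
</complexType>
```

Because the wildcard is lax, a processor that has the myns declarations
available would still validate <myns:CD> against the myns:CD type, and the
application can then check that the type derives from jdf:Resource.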

Hope this helps,

Matthew

> -----Original Message-----
> From: Slein, Judith A [mailto:JSlein@crt.xerox.com]
> Sent: Thursday, January 11, 2001 10:17 AM
> To: Fuchs, Matthew; Slein, Judith A; 'www-xml-schema-comments@w3.org'
> Cc: Sembower, Neil R
> Subject: RE: About those "non-deterministic content model" errors
> 
> 
> Thanks again for a very interesting discussion.
> 
> If I understand the second option you describe, our schemas and instance
> documents should work today. <any> appears only in the base JDF schema, not
> in our extensions.  The extension schema contains new complexType
> definitions derived from the complexType jdf:Resource, and elements of
> those types with substitutionGroup="jdf:Resource".
> 
> That is, the JDF schema has:
> 
> <element name="Resource" type="jdf:Resource" abstract="true"/>
> <complexType name="Resource" abstract="true">
> ...
> </complexType>
> ...other concrete Resource type / element definitions...
> <element name="ResourcePool" type="jdf:ResourcePool"/>
> <complexType name="ResourcePool">
>     <complexContent>
>        <extension base="jdf:GenericContent">
>           <choice minOccurs="0" maxOccurs="unbounded">
>              <element ref="jdf:Resource"/>
>              <any namespace="##other" processContents="lax"/>
>           </choice>
>        </extension>
>     </complexContent>
> </complexType>
> 
> Then our extension schema defines:
> 
> <element name="CD" type="myns:CD" substitutionGroup="jdf:Resource"/>
> <complexType name="CD">
>    <complexContent>
>       <extension base="jdf:Resource">
>       ...
>       </extension>
>    </complexContent>
> </complexType>
> 
> Then the instance document has:
> 
> <jdf:ResourcePool>
>    <jdf:Component .../>
>    <myns:CD .../>
>    <jdf:GatheringParams .../>
> </jdf:ResourcePool>
> 
> Perhaps there is some disagreement between you and Henry 
> Thompson about what
> the XML Schema spec requires.  Henry believes that since we 
> are inside a
> <choice>, and there is no way to determine which of the 
> branches to use for
> validation of <myns:CD>, the spec requires him to raise a 
> non-deterministic
> content model schema error.
> 
> --Judy
> 
> -----Original Message-----
> From: Fuchs, Matthew [mailto:matthew.fuchs@commerceone.com]
> Sent: Wednesday, January 10, 2001 6:56 PM
> To: 'Slein, Judith A'; 'www-xml-schema-comments@w3.org'
> Cc: Sembower, Neil R
> Subject: RE: About those "non-deterministic content model" errors
> 
> 
> Judith,
> 
> This is a marvelously interesting problem, with two important, but
> conflicting goals -
> 
> 1) Minimize information needs for validating a portion of a document.
> 2) Maintain type safety and allow controlled evolution.
> 
> This is similar to applications which use a validating parser when building
> an application, but then turn it off at run time, as only valid documents
> should occur, so why incur the cost of validation?  This question arose in
> the early days of the WG, and at that time I suggested there is a validation
> spectrum, not just valid or well-formed.
> 
> I think the slickest answer to this problem would be a place on that
> validation spectrum called something like "parse-as-base", where xsi:type
> would simply be ignored, and unexpected elements at the end of a content
> model would be assumed to come from the ignored derived type.  Then parties
> could use type extension ad nauseam (but not substitution groups), but all
> these private extensions would be ignored by a parse in "parse-as-base"
> mode.  An unextended JDF system would implicitly view the tail of an
> extendable type as allowing any content.  There could also be a
> "parse-as-known" mode, closer to lax, which pays attention to xsi:type if it
> recognizes the type.  This would allow known extensions to be automatically
> handled, and unknown ones to be ignored.  (Note that if I ignore the value
> of xsi:type, then I don't need to load its definition to parse the element
> it occurs on.)  Substitution groups could not be used here because whenever
> there is a <choice> and an unknown element shows up, it would be necessary
> to get the definition of the element to discover for which element it
> substitutes.
> 
> The second most elegant solution would be to exercise the (intentional)
> loophole in XSDL for locating schemas.  Since it is up to the validator,
> really, to decide what schema to use for parsing a document, the JDF
> structure can easily finesse the use of extension you are trying to do.
> Their base schema (and only their base schema) would include
> <any namespace="##other" processContents="lax" minOccurs="0"/> at the
> appropriate places (your problem was from putting an <any> in your
> extension).  The instances would replace the any with some element and use
> schemaLocation to point to the definitions.  Default processors could simply
> ignore these hints and process the new elements laxly.  "Aware" processors
> examine the hints and therefore validate the known elements.  They get to
> validate without loading any additional schema info, you get to validate
> with all additional schema info, or somewhere in between.
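> 
> A rough sketch of such an instance.  The namespace URIs and the schema
> location are hypothetical placeholders, not taken from the JDF spec, and
> the xsi namespace shown is the one from the final Recommendation:
> 
> ```xml
> <jdf:ResourcePool xmlns:jdf="http://example.org/jdf"
>       xmlns:myns="http://example.org/myns"
>       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>       xsi:schemaLocation="http://example.org/myns http://example.org/myns.xsd">
>    <jdf:Component/>
>    <!-- Fills the <any namespace="##other"> slot of the base schema. -->
>    <myns:CD/>
> </jdf:ResourcePool>
> ```
> 
> A processor that ignores the schemaLocation hint treats <myns:CD> laxly;
> one that follows it can validate <myns:CD> against the extension schema.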
> 
> To a certain extent, both of these do the same thing, except the first
> allows you to use the refinement mechanism as was intended, 
> while the second
> is ad hoc.
> 
> The problem with that kind of private extension mechanism is that it makes
> applications harder to write because the element type isn't the _real_
> element type - it's some type inferred by looking at private extensions, so
> extra code needs to be written and maintained to find that information,
> while using the extension mechanism provides it automatically.
> 
> I think either of the solutions above solves the problem.  
> The second, while
> technically less interesting, requires no new action on the 
> part of the
> Schema WG.
> 
> Matthew
> 
> > -----Original Message-----
> > From: Slein, Judith A [mailto:JSlein@crt.xerox.com]
> > Sent: Wednesday, January 10, 2001 7:08 AM
> > To: Fuchs, Matthew; Slein, Judith A; 
> 'www-xml-schema-comments@w3.org'
> > Cc: Sembower, Neil R
> > Subject: RE: About those "non-deterministic content model" errors
> > 
> > 
> > Thanks for taking the time to look into this issue.  More 
> below . . .
> > 
> > -----Original Message-----
> > From: Fuchs, Matthew [mailto:matthew.fuchs@commerceone.com]
> > Sent: Tuesday, January 09, 2001 4:30 PM
> > To: 'Slein, Judith A'; 'www-xml-schema-comments@w3.org'
> > Subject: RE: About those "non-deterministic content model" errors
> > 
> > 
> > A cursory glance through section 3.10 leads me to a two 
> part response:
> > 
> > 1) I'm not sure the authors understand the XSDL extension 
> > mechanism.  In any
> > case, I'd love a more in-depth discussion than "this 
> mechanism is not
> > allowed to be applied to any elements defined by JDF 
> because  such new
> > element types can only be understood by the agent that made 
> > the extensions"
> > given the amount of work done to make sure that that 
> > statement is false.
> > That you use extension in your example is especially confusing.
> > 
> > <JS>I had some e-mail conversations with the JDF spec authors 
> > last fall in
> > which I tried to persuade them to encourage the use of 
> > derived types for
> > extensions.  Here's the most concise statement of their 
> > rationale that came
> > out of the discussion:
> > 
> > They don't want people to derive types from any JDF types in other
> > namespaces because:
> > <MM>If we loosened this constraint, the problem would arise that
> > everybody who reads your JDFs must know what types you derived from
> > Component in order to interpret the JDF-standard Component.  If you allow
> > only extensions by private namespaces, you don't have to know anything
> > about your extensions in order to understand the standard.
> > You are allowed to include your private sub-elements and attributes in the
> > Component, which leads to the same result as achieved by derivation.
> > 
> > Derivation is very fine in the object-oriented sense. My opinion is that
> > the power of derivation first shows up when you introduce methods.  Methods
> > are beyond our JDF parameter files.  Derivation of data types for extension
> > is nothing more than adding additional parameters.  The use of derivation
> > for making abstract parameters instantiable (the counterpart to virtual
> > functions) should not be allowed, for the reasons mentioned above.</MM>
> > 
> > <MM>
> > We want to be able to validate JDF files by using only the JDF schema.  We
> > don't want to look for other schema files for validation.  If we allowed
> > everybody to use private extensions together with schemas, it may happen
> > that we need a lot of schemas in order to validate a JDF.  Then, if only one
> > schema is missing, the validation would fail.  Would you accept this
> > statement?
> > </MM>
> > 
> > </JS>
> > 
> > 2) The use of "any" is not the best way to get what you want, 
> > anyway.  One
> > possibility is the following:
> > 
> > <element name="PrivateExtension" type="xsd:anyType"/>
> > <element name="ResourcePool" type="jdf:ResourcePool"/>
> > <complexType name="ResourcePool">
> >    <complexContent>
> >       <extension base="jdf:GenericContent">
> >          <choice minOccurs="0" maxOccurs="unbounded">
> >             <element ref="jdf:Resource"/>
> >             <!-- Extension resources are allowed.  _They_can_now_be_JDF_Resources_
> >                  They can, in fact, be any declared type from any schema,
> >                  whether derived or not -->
> >             <element ref="PrivateExtension" minOccurs="0"/>
> >          </choice>
> >       </extension>
> >    </complexContent>
> > </complexType>
> > 
> > An optional PrivateExtension should appear wherever 
> > extensions are allowed,
> > per JDF.  This ensures, as they want, that "private 
> > extensions" can appear
> > in any of these places.  It is also clearly marked as a 
> > private extension -
> > and either refinement or substitution groups can be used to 
> > distinguish the
> > desired content without any concern for nondeterministic 
> > content models.  I
> > would consider this the best solution that doesn't allow you 
> > to extend the
> > base JDF types.  It solves their "issue" (no use of 
> > extensions for base JDF
> > types) while maintaining type safety.
> > 
> > If they're really dead set against extensions, then you can 
> > give them a
> > PrivateExtension type which explicitly allows <any> content.
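> > 
> > A sketch of what that could look like.  The type name and the jdf prefix
> > on it are assumptions for illustration only:
> > 
> > ```xml
> > <element name="PrivateExtension" type="jdf:PrivateExtension"/>
> > <complexType name="PrivateExtension">
> >    <sequence>
> >       <!-- Any elements from any namespace; validated only where a
> >            declaration can be found. -->
> >       <any namespace="##any" processContents="lax"
> >            minOccurs="0" maxOccurs="unbounded"/>
> >    </sequence>
> >    <anyAttribute namespace="##any" processContents="lax"/>
> > </complexType>
> > ```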
> > 
> > <JS>I hadn't thought about this approach.  But I think it is 
> > not a solution
> > for the case at hand, because JDF wants the instance document 
> > to look like
> > this (I think):
> > 
> > <jdf:ResourcePool>
> >    <jdf:Component . . .>. . .</jdf:Component>
> >    <myns:CD . . .>. . .</myns:CD>
> >    <myns:DVD . . .>. . .</myns:DVD>
> >    <jdf:GatheringParams . . ./>
> >    <jdf:PackingParams . . ./>
> > </jdf:ResourcePool>
> > 
> > </JS>
> > 
> > Matthew
> > 
> > > -----Original Message-----
> > > From: Slein, Judith A [mailto:JSlein@crt.xerox.com]
> > > Sent: Tuesday, January 09, 2001 12:28 PM
> > > To: Fuchs, Matthew; Slein, Judith A; 
> > 'www-xml-schema-comments@w3.org'
> > > Subject: RE: About those "non-deterministic content model" errors
> > > 
> > > 
> > > The JDF specification is at 
> > http://www.job-definition-format.org.  The
> > > discussion of extensions is in section 3.10.  The notion 
> > > "private extension"
> > > is not defined.
> > > 
> > > Basically, the JDF spec discourages the use of derived types 
> > > for extensions.
> > > This leads to my assumption that it will use "any" at 
> > > extension points in
> > > its XML Schema.  But using "any" at these points just pushes 
> > > the task of
> > > enforcement of semantics into JDF-specific code -- the intent 
> > > is still that
> > > the elements that can appear at these points must be of the 
> > > "right sort".
> > > In the example, they must have the same structure as if 
> > they had been
> > > derived from the complexType jdf:Resource, even though 
> > > actually deriving
> > > them is discouraged.
> > > 
> > > Since you can get so much more powerful enforcement of 
> > > constraints by using
> > > derived types, we want to go ahead and do so (in our own 
> > > namespace).  But
> > > for now, the XML Schema spec is preventing us from doing 
> > so, since the
> > > schema processor then cannot determine whether it should use the
> > > jdf:Resource branch or the "any" branch in the choice to 
> > validate our
> > > extension element.  We want the jdf:Resource branch to be used.
> > > 
> > > Anyhow, I don't think this problem will be unique to JDF.  
> > > Just imagine any
> > > case where I am faced with an XML Schema written by somebody 
> > > else, which I
> > > want to extend.  It has in it the construct in the example. 
> >  Now I can
> > > define an element and its type from scratch, independent of 
> > > the definition
> > > of jdf:Resource.  In that case they will be validated using 
> > > the "any" branch
> > > of the choice.  But this is a much weaker form of validation 
> > > than I could
> > > get if there were a rule in XML Schema that in this sort of 
> > > case, a schema
> > > processor should validate using the more restrictive branch 
> > > if possible, and
> > > only use the "any" branch as a last resort.  Then I could 
> > > derive a type from
> > > the jdf:Resource complexType, and create an element with
> > > substitutionGroup="jdf:Resource", and it would be validated 
> > > to be sure it is
> > > in fact of type jdf:Resource.
> > > 
> > > --Judy
> > > 
> > > -----Original Message-----
> > > From: Fuchs, Matthew [mailto:matthew.fuchs@commerceone.com]
> > > Sent: Tuesday, January 09, 2001 2:49 PM
> > > To: 'Slein, Judith A'; 'www-xml-schema-comments@w3.org'
> > > Subject: RE: About those "non-deterministic content model" errors
> > > 
> > > 
> > > I do not understand the "co-constraint" between your policy 
> > regarding
> > > extensions and your use of any.  What is a "private 
> > > extension"?  Can you
> > > give a pointer to the JDF spec to explain these things?  I 
> > > can't tell if
> > > what you're trying to do is "reasonable" until I can 
> > > understand the goal.
> > > 
> > > Thanks,
> > > 
> > > Matthew Fuchs
> > > 
> > > > -----Original Message-----
> > > > From: Slein, Judith A [mailto:JSlein@crt.xerox.com]
> > > > Sent: Tuesday, January 09, 2001 8:34 AM
> > > > To: 'www-xml-schema-comments@w3.org'
> > > > Subject: FW: About those "non-deterministic content 
> model" errors
> > > > 
> > > > 
> > > > When I sent this note to Henry Thompson, he suggested that I 
> > > > send a comment
> > > > to the XML Schema working group.  So here is the example I 
> > > > sent him, and his
> > > > response:
> > > > 
> > > > <HT>I think I understand the design goal and agree it's 
> > > > reasonable.  The
> > > > problem is not with XSV, but with the XML Schema spec. 
> > > itself.  Given
> > > > the example, the relevant element in the instance could be
> > > > accepted by either branch of the <choice>, and that's 
> not allowed.
> > > > 
> > > > Please send this example to www-xml-schema-comments and 
> > > > prefix it with 
> > > > the observation that without a version of 'any' which 
> > > explicitly does 
> > > > _not_ validate anything which would cause a violation of 
> > the unique
> > > > attribution restriction you can't do what you (quite 
> > > reasonably) want
> > > > to do.</HT>
> > > > 
> > > > -----Original Message-----
> > > > From: Slein, Judith A 
> > > > Sent: Monday, January 08, 2001 9:47 AM
> > > > To: 'ht@cogsci.ed.ac.uk'
> > > > Subject: About those "non-deterministic content model" errors
> > > > 
> > > > 
> > > > These errors have been causing me no end of headaches, and it 
> > > > seems to me
> > > > XSV should be able to figure out what to do.
> > > > 
> > > > I'm in the situation of having to implement the JDF spec, 
> > > > which is being
> > > > developed by a printing industry consortium.  The JDF spec 
> > > > does not include
> > > > an XML Schema yet, but it seems relatively easy to figure out 
> > > > from their
> > > > models what the schema will look like, so I've taken a stab at 
> > > > writing one.
> > > > The spec forbids the use of derived types for extensions 
> > > > except in the case
> > > > of "private extensions".  So I'm assuming they will put 
> "any" and
> > > > "anyAttribute" at all the extension points.  However, we 
> > > > would like to use
> > > > derived types in spite of their prohibition, and just say 
> > > that we are
> > > > defining private extensions.  So you get definitions like:
> > > > 
> > > > <element name="ResourcePool" type="jdf:ResourcePool"/>
> > > > <complexType name="ResourcePool">
> > > >    <complexContent>
> > > >       <extension base="jdf:GenericContent">
> > > >          <choice minOccurs="0" maxOccurs="unbounded">
> > > >             <element ref="jdf:Resource"/>
> > > > <!-- Extension resources are allowed.  They must have the 
> > > structure of
> > > >      JDF resources, but JDF doesn't allow the use of 
> derived types
> > > >      to define them.  We will use derived types anyhow, but 
> > > > to be prepared
> > > >      for non-derived resources from 3rd parties . . . -->
> > > >             <any namespace="##other" processContents="lax"/>
> > > >          </choice>
> > > >       </extension>
> > > >    </complexContent>
> > > > </complexType>
> > > > 
> > > > Then we define in our namespace new types derived from 
> > > > Resource.  Using our
> > > > derived types in an instance document then causes the 
> > > > "non-deterministic
> > > > content model" schema error.  But since we declare the 
> > > > substitutionGroup of
> > > > our elements to be "jdf:Resource", it seems to me that a 
> > > > schema validator
> > > > should try to use the more restrictive validation path.  That 
> > > > is, it could
> > > > have a rule that says, if you can validate this without 
> > > > resorting to "any",
> > > > do so. Otherwise, use "any".
> > > > 
> > > > What do you think?
> > > > 
> > > > --Judy
> > > > 
> > > > 
> > > > 
> > > > -----Original Message-----
> > > > From: ht@cogsci.ed.ac.uk [mailto:ht@cogsci.ed.ac.uk]
> > > > Sent: Friday, January 05, 2001 4:51 PM
> > > > To: Slein, Judith A
> > > > Cc: 'xmlschema-dev@w3.org'
> > > > Subject: Re: False "undefined type" error from XSV
> > > > 
> > > > 
> > > > Can't reproduce with the current version, sorry.  Try 
> upgrading to
> > > > XSV11.EXE, and try again.
> > > > 
> > > > Here are the error messages I get from the current version:
> > > > 
> > > > <schemaError char='55' line='371' phase='instance'
> > > > resource='file:///projects/ltg/users/ht/xml/xmlschema/monk/slein/JDF.xsd'>non-deterministic content model for type ResourcePool: {Wildcard: ##other}/{http://www.xerox.com/xmlschemas/DigiFinish}:BindingIntent</schemaError>
> > > > <schemaWarning char='31' line='99' phase='instance'
> > > > resource='file:///projects/ltg/users/ht/xml/xmlschema/monk/slein/JDF.xsd'>restricting a list with facets not implemented yet</schemaWarning>
> > > > <schemaWarning char='31' line='99' phase='instance'
> > > > resource='file:///projects/ltg/users/ht/xml/xmlschema/monk/slein/JDF.xsd'>restricting a list with facets not implemented yet</schemaWarning>
> > > > <schemaWarning char='31' line='99' phase='instance'
> > > > resource='file:///projects/ltg/users/ht/xml/xmlschema/monk/slein/JDF.xsd'>restricting a list with facets not implemented yet</schemaWarning>
> > > > <schemaError char='63' line='532' phase='instance'
> > > > resource='file:///projects/ltg/users/ht/xml/xmlschema/monk/slein/JDF.xsd'>non-deterministic content model for type ResourceLinkPool: {Wildcard: ##other}/{http://www.xerox.com/xmlschemas/DigiFinish}:VerificationIntentLink</schemaError>
> > > > 
> > > > ht
> > > > -- 
> > > >   Henry S. Thompson, HCRC Language Technology Group, 
> > > > University of Edinburgh
> > > >           W3C Fellow 1999--2001, part-time member of W3C Team
> > > >      2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 
> > > > 131 650-4440
> > > > 	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
> > > > 		     URL: http://www.ltg.ed.ac.uk/~ht/
> > > > 
> > > 
> > 
> 
Received on Thursday, 11 January 2001 14:28:48 GMT
