RE: About those "non-deterministic content model" errors

Judith,

This is a marvelously interesting problem, with two important but
conflicting goals:

1) Minimize information needs for validating a portion of a document.
2) Maintain type safety and allow controlled evolution.

This is similar to applications that use a validating parser while the
application is being built, but then turn it off at run time: only valid
documents should occur, so why incur the cost of validation?  This question
arose in the early days of the WG, and at that time I suggested there is a
validation spectrum, not just valid or well-formed.

I think the slickest answer to this problem would be a place on that
validation spectrum called something like "parse-as-base", where xsi:type
would simply be ignored, and unexpected elements at the end of a content
model would be assumed to come from the ignored derived type.  Then parties
could use type extension ad nauseam (but not substitution groups), but all
these private extensions would be ignored by a parse in "parse-as-base"
mode.  An unextended JDF system would implicitly view the tail of an
extendable type as allowing any content.  There could also be a
"parse-as-known" mode, closer to lax, which pays attention to xsi:type if it
recognizes the type.  This would allow known extensions to be automatically
handled, and unknown ones to be ignored.  (Note that if I ignore the value
of xsi:type, then I don't need to load its definition to parse the element
it occurs on.)  Substitution groups could not be used here because whenever
there is a <choice> and an unknown element shows up, it would be necessary
to get the definition of the element to discover for which element it
substitutes.
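
To make the "parse-as-base" idea concrete, here is a rough sketch of an
instance as such a processor might see it.  The namespace URIs and the names
myns:CDComponent, jdf:Description and myns:TrackCount are invented for
illustration and do not come from the JDF spec; the xsi namespace shown is
the one from the final XML Schema Recommendation.

```xml
<!-- Hypothetical instance: all URIs and the names CDComponent,
     Description and TrackCount are placeholders, not JDF definitions -->
<jdf:ResourcePool xmlns:jdf="http://example.org/jdf"
                  xmlns:myns="http://example.org/myns"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!-- In "parse-as-base" mode the xsi:type attribute is ignored, so the
       definition of myns:CDComponent never has to be loaded -->
  <jdf:Component xsi:type="myns:CDComponent">
    <!-- Content declared by the base type: validated as usual -->
    <jdf:Description>...</jdf:Description>
    <!-- Unexpected element at the end of the base content model: assumed
         to come from the ignored derived type and skipped -->
    <myns:TrackCount>12</myns:TrackCount>
  </jdf:Component>
</jdf:ResourcePool>
```

In "parse-as-base" mode the processor would validate only what jdf:Component
declares and skip the tail.  In "parse-as-known" mode it would additionally
validate myns:TrackCount if it happened to recognize myns:CDComponent.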

The second most elegant solution would be to exercise the (intentional)
loophole in XSDL for locating schemas.  Since it is really up to the
validator to decide what schema to use for parsing a document, the JDF
structure can easily finesse the kind of extension you are trying to do.
Their base schema (and only their base schema) would include
<any namespace="##other" processContents="lax" minOccurs="0"/> at the
appropriate places (your problem came from putting an <any> in your
extension).  The instances would replace the <any> with some element and use
schemaLocation to point to the definitions.  Default processors could simply
ignore these hints and process the new elements laxly.  "Aware" processors
would examine the hints and therefore validate the known elements.  They get
to validate without loading any additional schema info; you get to validate
with all the additional schema info, or somewhere in between.
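
As a sketch of this second approach, an instance might look like the
following.  The namespace URIs and the schema location are placeholders, and
I am assuming myns:CD is declared as an ordinary global element in the
extension schema rather than as a member of the jdf:Resource substitution
group.

```xml
<!-- Hypothetical instance: the URIs and extensions.xsd are placeholders -->
<jdf:ResourcePool
    xmlns:jdf="http://example.org/jdf"
    xmlns:myns="http://example.org/myns"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://example.org/myns
                        http://example.org/myns/extensions.xsd">
  <!-- Declared in the base JDF schema: always validated -->
  <jdf:Component>...</jdf:Component>
  <!-- Takes the place of the <any namespace="##other"
       processContents="lax"/> wildcard in the base schema -->
  <myns:CD>...</myns:CD>
</jdf:ResourcePool>
```

A default processor loads only the base JDF schema, matches myns:CD against
the lax wildcard, finds no declaration for it, and lets it through.  An
"aware" processor follows the schemaLocation hint, finds the declaration,
and validates myns:CD against it.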

To a certain extent, both of these do the same thing, except the first
allows you to use the refinement mechanism as was intended, while the second
is ad hoc.

The problem with that kind of private extension mechanism is that it makes
applications harder to write because the element type isn't the _real_
element type - it's some type inferred by looking at private extensions, so
extra code needs to be written and maintained to find that information,
whereas using the extension mechanism provides it automatically.

I think either of the solutions above solves the problem.  The second, while
technically less interesting, requires no new action on the part of the
Schema WG.

Matthew

> -----Original Message-----
> From: Slein, Judith A [mailto:JSlein@crt.xerox.com]
> Sent: Wednesday, January 10, 2001 7:08 AM
> To: Fuchs, Matthew; Slein, Judith A; 'www-xml-schema-comments@w3.org'
> Cc: Sembower, Neil R
> Subject: RE: About those "non-deterministic content model" errors
> 
> 
> Thanks for taking the time to look into this issue.  More below . . .
> 
> -----Original Message-----
> From: Fuchs, Matthew [mailto:matthew.fuchs@commerceone.com]
> Sent: Tuesday, January 09, 2001 4:30 PM
> To: 'Slein, Judith A'; 'www-xml-schema-comments@w3.org'
> Subject: RE: About those "non-deterministic content model" errors
> 
> 
> A cursory glance through section 3.10 leads me to a two part response:
> 
> 1) I'm not sure the authors understand the XSDL extension mechanism.  In any
> case, I'd love a more in-depth discussion than "this mechanism is not
> allowed to be applied to any elements defined by JDF because such new
> element types can only be understood by the agent that made the extensions"
> given the amount of work done to make sure that that statement is false.
> That you use extension in your example is especially confusing.
> 
> <JS>I had some e-mail conversations with the JDF spec authors last fall in
> which I tried to persuade them to encourage the use of derived types for
> extensions.  Here's the most concise statement of their rationale that came
> out of the discussion:
> 
> They don't want people to derive types from any JDF types in other
> namespaces because:
> <MM>If we would loosen this constraint the problem raises that then
> everybody who reads your JDFs must know what types you derived from
> Component in order to interpret the JDF-standard Component.  If you allow
> only extensions by private namespaces, you don't have to know something
> about your extensions in order to understand the standard.
> You are allowed to include your private sub-elements and attributes in the
> Component which leads to the same result has achieved by derivation.
> 
> Derivation is very fine in object oriented sense. My opinion is that the
> power of derivation comes up at first if you introduce methods.  Methods are
> beyond our JDF parameter files.  Derivation of data types for extension is
> nothing else than adding additional parameters.  The use of derivation for
> making abstract parameters instanciable (pendant to virtual functions)
> should not be allowed due to the reasons mentioned above.</MM>
> 
> <MM>
> We want to be able to validate JDF-files by using only the JDF-Schema.  We
> don't want look for other schema files for validation.  If we would allow
> everybody to use private extensions together with schemas, it may happen
> that we need a lot of schemas in order to validate a JDF.  Then, if only one
> schema is missing the validation would fail.  Would you accept this
> statement ?
> </MM>
> 
> </JS>
> 
> 2) The use of "any" is not the best way to get what you want, anyway.  One
> possibility is the following:
> 
> <element name="PrivateExtension" type="xsd:anyType"/>
> <element name="ResourcePool" type="jdf:ResourcePool"/>
> <complexType name="ResourcePool">
>    <complexContent>
>       <extension base="jdf:GenericContent">
>          <choice minOccurs="0" maxOccurs="unbounded">
>             <element ref="jdf:Resource"/>
>             <!-- Extension resources are allowed.
>                  _They_can_now_be_JDF_Resources_
>                  They can, in fact, be any declared type from any schema,
>                  whether derived or not -->
>             <element ref="PrivateExtension" minOccurs="0"/>
>          </choice>
>       </extension>
>    </complexContent>
> </complexType>
> 
> An optional PrivateExtension should appear wherever extensions are allowed,
> per JDF.  This ensures, as they want, that "private extensions" can appear
> in any of these places.  It is also clearly marked as a private extension -
> and either refinement or substitution groups can be used to distinguish the
> desired content without any concern for nondeterministic content models.  I
> would consider this the best solution that doesn't allow you to extend the
> base JDF types.  It solves their "issue" (no use of extensions for base JDF
> types) while maintaining type safety.
> 
> If they're really dead set against extensions, then you can give them a
> PrivateExtension type which explicitly allows <any> content.
> 
> <JS>I hadn't thought about this approach.  But I think it is not a solution
> for the case at hand, because JDF wants the instance document to look like
> this (I think):
> 
> <jdf:ResourcePool>
>    <jdf:Component . . .>. . .</jdf:Component>
>    <myns:CD . . .>. . .</myns:CD>
>    <myns:DVD . . .>. . .</myns:DVD>
>    <jdf:GatheringParams . . ./>
>    <jdf:PackingParams . . ./>
> </jdf:ResourcePool>
> 
> </JS>
> 
> Matthew
> 
> > -----Original Message-----
> > From: Slein, Judith A [mailto:JSlein@crt.xerox.com]
> > Sent: Tuesday, January 09, 2001 12:28 PM
> > To: Fuchs, Matthew; Slein, Judith A; 'www-xml-schema-comments@w3.org'
> > Subject: RE: About those "non-deterministic content model" errors
> > 
> > 
> > The JDF specification is at http://www.job-definition-format.org.  The
> > discussion of extensions is in section 3.10.  The notion "private extension"
> > is not defined.
> > 
> > Basically, the JDF spec discourages the use of derived types for extensions.
> > This leads to my assumption that it will use "any" at extension points in
> > its XML Schema.  But using "any" at these points just pushes the task of
> > enforcement of semantics into JDF-specific code -- the intent is still that
> > the elements that can appear at these points must be of the "right sort".
> > In the example, they must have the same structure as if they had been
> > derived from the complexType jdf:Resource, even though actually deriving
> > them is discouraged.
> > 
> > Since you can get so much more powerful enforcement of constraints by using
> > derived types, we want to go ahead and do so (in our own namespace).  But
> > for now, the XML Schema spec is preventing us from doing so, since the
> > schema processor then cannot determine whether it should use the
> > jdf:Resource branch or the "any" branch in the choice to validate our
> > extension element.  We want the jdf:Resource branch to be used.
> > 
> > Anyhow, I don't think this problem will be unique to JDF.  Just imagine any
> > case where I am faced with an XML Schema written by somebody else, which I
> > want to extend.  It has in it the construct in the example.  Now I can
> > define an element and its type from scratch, independent of the definition
> > of jdf:Resource.  In that case they will be validated using the "any" branch
> > of the choice.  But this is a much weaker form of validation than I could
> > get if there were a rule in XML Schema that in this sort of case, a schema
> > processor should validate using the more restrictive branch if possible, and
> > only use the "any" branch as a last resort.  Then I could derive a type from
> > the jdf:Resource complexType, and create an element with
> > substitutionGroup="jdf:Resource", and it would be validated to be sure it is
> > in fact of type jdf:Resource.
> > 
> > --Judy
> > 
> > -----Original Message-----
> > From: Fuchs, Matthew [mailto:matthew.fuchs@commerceone.com]
> > Sent: Tuesday, January 09, 2001 2:49 PM
> > To: 'Slein, Judith A'; 'www-xml-schema-comments@w3.org'
> > Subject: RE: About those "non-deterministic content model" errors
> > 
> > 
> > I do not understand the "co-constraint" between your policy regarding
> > extensions and your use of any.  What is a "private extension"?  Can you
> > give a pointer to the JDF spec to explain these things?  I can't tell if
> > what you're trying to do is "reasonable" until I can understand the goal.
> > 
> > Thanks,
> > 
> > Matthew Fuchs
> > 
> > > -----Original Message-----
> > > From: Slein, Judith A [mailto:JSlein@crt.xerox.com]
> > > Sent: Tuesday, January 09, 2001 8:34 AM
> > > To: 'www-xml-schema-comments@w3.org'
> > > Subject: FW: About those "non-deterministic content model" errors
> > > 
> > > 
> > > When I sent this note to Henry Thompson, he suggested that I send a
> > > comment to the XML Schema working group.  So here is the example I sent
> > > him, and his response:
> > > 
> > > <HT>I think I understand the design goal and agree it's reasonable.  The
> > > problem is not with XSV, but with the XML Schema spec. itself.  Given
> > > the example, the relevant element in the instance could be
> > > accepted by either branch of the <choice>, and that's not allowed.
> > > 
> > > Please send this example to www-xml-schema-comments and prefix it with
> > > the observation that without a version of 'any' which explicitly does
> > > _not_ validate anything which would cause a violation of the unique
> > > attribution restriction you can't do what you (quite reasonably) want
> > > to do.</HT>
> > > 
> > > -----Original Message-----
> > > From: Slein, Judith A 
> > > Sent: Monday, January 08, 2001 9:47 AM
> > > To: 'ht@cogsci.ed.ac.uk'
> > > Subject: About those "non-deterministic content model" errors
> > > 
> > > 
> > > These errors have been causing me no end of headaches, and it seems to me
> > > XSV should be able to figure out what to do.
> > > 
> > > I'm in the situation of having to implement the JDF spec, which is being
> > > developed by a printing industry consortium.  The JDF spec does not
> > > include an XML Schema yet, but it seems relatively easy to figure out
> > > from their models what the schema will look like, so I've taken a stab at
> > > writing one.  The spec forbids the use of derived types for extensions
> > > except in the case of "private extensions".  So I'm assuming they will
> > > put "any" and "anyAttribute" at all the extension points.  However, we
> > > would like to use derived types in spite of their prohibition, and just
> > > say that we are defining private extensions.  So you get definitions like:
> > > 
> > > <element name="ResourcePool" type="jdf:ResourcePool"/>
> > > <complexType name="ResourcePool">
> > >    <complexContent>
> > >       <extension base="jdf:GenericContent">
> > >          <choice minOccurs="0" maxOccurs="unbounded">
> > >             <element ref="jdf:Resource"/>
> > > <!-- Extension resources are allowed.  They must have the structure of
> > >      JDF resources, but JDF doesn't allow the use of derived types
> > >      to define them.  We will use derived types anyhow, but to be prepared
> > >      for non-derived resources from 3rd parties . . . -->
> > >             <any namespace="##other" processContents="lax"/>
> > >          </choice>
> > >       </extension>
> > >    </complexContent>
> > > </complexType>
> > > 
> > > Then we define in our namespace new types derived from Resource.  Using
> > > our derived types in an instance document then causes the
> > > "non-deterministic content model" schema error.  But since we declare the
> > > substitutionGroup of our elements to be "jdf:Resource", it seems to me
> > > that a schema validator should try to use the more restrictive validation
> > > path.  That is, it could have a rule that says, if you can validate this
> > > without resorting to "any", do so. Otherwise, use "any".
> > > 
> > > What do you think?
> > > 
> > > --Judy
> > > 
> > > 
> > > 
> > > -----Original Message-----
> > > From: ht@cogsci.ed.ac.uk [mailto:ht@cogsci.ed.ac.uk]
> > > Sent: Friday, January 05, 2001 4:51 PM
> > > To: Slein, Judith A
> > > Cc: 'xmlschema-dev@w3.org'
> > > Subject: Re: False "undefined type" error from XSV
> > > 
> > > 
> > > Can't reproduce with the current version, sorry.  Try upgrading to
> > > XSV11.EXE, and try again.
> > > 
> > > Here are the error messages I get from the current version:
> > > 
> > > <schemaError char='55' line='371' phase='instance'
> > > resource='file:///projects/ltg/users/ht/xml/xmlschema/monk/slein/JDF.xsd'>non-deterministic content model for type ResourcePool: {Wildcard: ##other}/{http://www.xerox.com/xmlschemas/DigiFinish}:BindingIntent</schemaError>
> > > <schemaWarning char='31' line='99' phase='instance'
> > > resource='file:///projects/ltg/users/ht/xml/xmlschema/monk/slein/JDF.xsd'>restricting a list with facets not implemented yet</schemaWarning>
> > > <schemaWarning char='31' line='99' phase='instance'
> > > resource='file:///projects/ltg/users/ht/xml/xmlschema/monk/slein/JDF.xsd'>restricting a list with facets not implemented yet</schemaWarning>
> > > <schemaWarning char='31' line='99' phase='instance'
> > > resource='file:///projects/ltg/users/ht/xml/xmlschema/monk/slein/JDF.xsd'>restricting a list with facets not implemented yet</schemaWarning>
> > > <schemaError char='63' line='532' phase='instance'
> > > resource='file:///projects/ltg/users/ht/xml/xmlschema/monk/slein/JDF.xsd'>non-deterministic content model for type ResourceLinkPool: {Wildcard: ##other}/{http://www.xerox.com/xmlschemas/DigiFinish}:VerificationIntentLink</schemaError>
> > > 
> > > ht
> > > -- 
> > >   Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
> > >           W3C Fellow 1999--2001, part-time member of W3C Team
> > >      2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
> > >           Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
> > >                URL: http://www.ltg.ed.ac.uk/~ht/
> > > 
> > 
> 

Received on Wednesday, 10 January 2001 18:56:26 UTC