Hello

I am joining this interesting discussion late; it was started by George Bina after we had some discussions about one *big* model.
Presenting this model may bring some new input to the discussion.
The problem is: "How can we design and write very big schemas, made of dozens and dozens of included or imported schemas?"

The model being studied is the combination of S1000D and DITA which, all in all, represents about 1850 elements and 650 attributes.
Such big schemas must be designed in a way that makes their writing and maintenance clear and simple.
They must also be designed in a way that eases testing and makes it simple to manage modifications to components.
For the interconnection of S1000D and DITA, the result is a set of 148 schemas, all linked together (you will find them in the enclosed zip file).
A UML representation of this big model has been produced: UML was the only good way to design, define, verify and correctly manage the logic of all the <xs:include>, <xs:redefine> and <xs:import> used in those schemas (the UML representation is in the enclosed PDF).
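
As a complement to the diagram, the links between schemas can also be listed mechanically. The following is only a minimal sketch in Python, not the tooling used for the enclosed schemas: the function name schema_links and the example schema text are hypothetical, and it reads only the top-level <xs:include>, <xs:redefine> and <xs:import> elements of one schema.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
# Map the qualified tag names to their short names.
LINK_TAGS = {f"{{{XS}}}{name}": name for name in ("include", "redefine", "import")}


def schema_links(xsd_text):
    """Return (kind, schemaLocation) pairs for the top-level links of one schema."""
    root = ET.fromstring(xsd_text)
    links = []
    for child in root:
        kind = LINK_TAGS.get(child.tag)
        if kind is not None:
            links.append((kind, child.get("schemaLocation")))
    return links


# Hypothetical schema text, NOT taken from the enclosed zip file.
EXAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:include schemaLocation="simpleTypes.xsd"/>
  <xs:redefine schemaLocation="attributeGroups.xsd"/>
  <xs:import namespace="urn:example:other" schemaLocation="other.xsd"/>
</xs:schema>"""

for kind, location in schema_links(EXAMPLE):
    print(kind, "->", location)
```

Run over every file of a schema set, such a listing gives the edges of the dependency graph, which can then be compared with the UML model.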

One challenge was to make each included or imported schema valid on its own. Confirming what Michael Kay has written, I think that it is not humanly acceptable to manage big schemas where included schemas are "partially valid", or "valid only when they are included in an including schema", or "valid when used with some instances and invalid for the others". Checking the global validity would be a mess, and probably just impossible to achieve.
Since S1000D is the standard for many industries (the entire Aerospace & Defence industries worldwide), it is not possible to propose to this industry schemas that are not perfectly valid, beyond any doubt. The normative committee of S1000D must be able to declare those schemas valid independently of any context. This will be even more true if we want to propose a big S1000D+DITA model to this industry.
It is mandatory that the EPWG (S1000D Electronic Publications Working Group) can test, and easily discuss, any component of the S1000D model.
Flexibility is mandatory for big schemas: this is why the design was based on the concept of "logical libraries of components" (complex types, simple types, attributes, attribute groups, simple elements, complex elements...).
A group of industries working on the same project should be able to write a new sub-version of the original schema(s) easily.
Handling recursive inclusions is inevitable and should not be a problem for a parser (a programmer can easily do it).
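
To illustrate why this is easy, here is a minimal sketch in Python of a traversal that tolerates recursive inclusions by remembering which schemas have already been loaded. It does not describe how any particular parser works: the function collect_schemas and the three in-memory schemas are hypothetical, and real resolution of relative URIs and namespaces is deliberately left out.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
LINKS = (XS + "include", XS + "redefine", XS + "import")
HEAD = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'

# Hypothetical in-memory schemas: a.xsd and b.xsd include each other.
SCHEMAS = {
    "a.xsd": HEAD + '<xs:include schemaLocation="b.xsd"/></xs:schema>',
    "b.xsd": HEAD + '<xs:include schemaLocation="a.xsd"/>'
                    '<xs:include schemaLocation="c.xsd"/></xs:schema>',
    "c.xsd": HEAD + "</xs:schema>",
}


def collect_schemas(start, load=SCHEMAS.__getitem__):
    """Return every schema reachable from `start`, each one listed once."""
    seen = set()
    order = []
    pending = [start]
    while pending:
        name = pending.pop()
        if name in seen:
            continue  # already loaded: this is where a cycle stops
        seen.add(name)
        order.append(name)
        root = ET.fromstring(load(name))
        for child in root:
            location = child.get("schemaLocation")
            if child.tag in LINKS and location is not None:
                pending.append(location)
    return order


print(collect_schemas("a.xsd"))
```

The check against the set of already loaded schemas is the whole trick: each document is loaded once, so a cycle between a.xsd and b.xsd simply ends the second time one of them is reached.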

The enclosed zip contains the 148 schemas, all linked together and declared valid by all the parsers except Xerces, which treats recursive inclusions as errors.

Best regards
Jean-Jacques