Hello
I am joining this interesting discussion late. It was initiated by
George Bina after we had some discussions about one *big* model.
Presenting this model may bring new input and ideas to the
discussion.
The problem is: "How can we design and write very big schemas made
up of dozens and dozens of included or imported schemas?"
The model being studied is the combination of S1000D and DITA which,
all in all, represents about 1850 elements and 650 attributes.
Such big schemas must be designed in a way that makes them clear and
simple to write and maintain.
They must also be designed in a way that eases testing and makes it
simple to manage modifications to their components.
For the interconnection of S1000D and DITA, the result is a set of
148 schemas, all linked together (you will find them in the enclosed
zip file).
A UML representation of this big model has been produced: UML was the
only good way to design, define, verify and correctly manage the
logic of all the <xs:include>, <xs:redefine> and
<xs:import> used in those schemas (the UML representation is in the
enclosed PDF).
- One challenge was to make each included or imported schema
valid on its own. Confirming what Michael Kay has written, I
think that it is not humanly feasible to manage big schemas
where included schemas are "partially valid", or "valid only
when they are included in an including schema", or "valid when
used with some instances and invalid for the others". Checking
the global validity would be a mess and probably just
impossible to achieve.
- Since S1000D is the standard for many industries (the
entire Aerospace & Defence industries worldwide), it is not
possible to propose schemas to this industry that are not
perfectly valid, beyond any doubt. The normative committee of
S1000D must be able to declare those schemas valid independently
of any context. This will be even more true if we want
to propose a big S1000D+DITA model to this industry.
- It is mandatory that the EPWG (S1000D Electronic Publications
Working Group) can test, and easily discuss, any component of
the S1000D model.
- Flexibility is mandatory for big schemas: this is why it was
decided to base the design on the concept of "logical
libraries of components" (complex types, simple types,
attributes, attribute groups, simple elements, complex
elements...).
- A group of industries working on the same project should be
able to write a new sub-version of the original schema(s) easily.
- Handling recursive inclusions is inevitable and should not be
a problem for a parser (a programmer can easily handle it).
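
To illustrate the last point, here is a minimal sketch in Python of
how a tool could walk a set of schemas that include each other
recursively without looping forever: it simply remembers which
documents it has already visited. This is not how any particular
parser is implemented. The function name collect_schemas, the load
callable and the file names a.xsd, b.xsd, c.xsd are hypothetical, and
schemaLocation values are treated as plain keys (no relative URI
resolution) to keep the example short.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
# The three composition elements discussed above.
LINK_TAGS = {XS + "include", XS + "import", XS + "redefine"}


def collect_schemas(root_name, load):
    """Return every schema document reachable from root_name, each one once.

    `load` is a hypothetical callable that maps a schemaLocation string to
    the text of that schema document (assumption: locations are plain keys).
    """
    visited = set()   # documents already processed
    order = []        # same documents, in order of first visit
    stack = [root_name]
    while stack:
        name = stack.pop()
        if name in visited:
            continue  # reached again through a cycle or a shared library
        visited.add(name)
        order.append(name)
        schema = ET.fromstring(load(name))
        for child in schema:  # composition elements are top-level children
            if child.tag in LINK_TAGS:
                location = child.get("schemaLocation")
                if location is not None:  # xs:import may omit it
                    stack.append(location)
    return order


# Tiny in-memory example: a.xsd and b.xsd include each other.
HEAD = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
DOCS = {
    "a.xsd": HEAD + '<xs:include schemaLocation="b.xsd"/></xs:schema>',
    "b.xsd": HEAD + '<xs:include schemaLocation="a.xsd"/>'
                    '<xs:import namespace="urn:c" schemaLocation="c.xsd"/>'
                    '</xs:schema>',
    "c.xsd": HEAD + '</xs:schema>',
}

if __name__ == "__main__":
    print(collect_schemas("a.xsd", DOCS.__getitem__))
```

The visited set is the whole trick: a document reached a second time
is skipped, so a cycle such as a.xsd -> b.xsd -> a.xsd ends after each
file has been read once.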
The enclosed zip contains the 148 schemas, all linked together
and declared valid by all the parsers except Xerces, which
treats recursive inclusions as errors.
Best regards
Jean-Jacques