RE: XML Schemas: Best Practices (touches XHTML modularization also)

I don't see xsi:type having any major benefits over substitution groups.  If it did, then I would take it as a sign of a weakness in XML Schema.  I don't think that XML Schema should favor some new
pattern (using xsi:type to indicate specialization) over a widely deployed pattern (using a different tag name to indicate specialization).

I think that you probably despise:

>     <Catalogue>
>         <Publication xsi:type="BookType"> ... </Publication>
>         <Publication xsi:type="MagazineType"> ... </Publication>
>         <Publication xsi:type="BookType"> ... </Publication>
>     </Catalogue>

because traditional XML usage has caused you to expect that the tag name identifies the "type" of the element, and because the Publication tag name adds very little additional information.
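
For reference, both styles can be declared side by side in one schema.  The following is only a minimal sketch: BookType and MagazineType are taken from the quoted instance, while PublicationType, the Title, ISBN and Issue children, and the Book and Magazine element names are assumptions made for illustration.  The namespace URI is that of the final XML Schema Recommendation; drafts current at the time of this message used a different one.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

	<!-- Assumed common base type; the quoted instance only names the derived types -->
	<xsd:complexType name="PublicationType">
		<xsd:sequence>
			<xsd:element name="Title" type="xsd:string"/>
		</xsd:sequence>
	</xsd:complexType>

	<xsd:complexType name="BookType">
		<xsd:complexContent>
			<xsd:extension base="PublicationType">
				<xsd:sequence>
					<xsd:element name="ISBN" type="xsd:string"/>
				</xsd:sequence>
			</xsd:extension>
		</xsd:complexContent>
	</xsd:complexType>

	<xsd:complexType name="MagazineType">
		<xsd:complexContent>
			<xsd:extension base="PublicationType">
				<xsd:sequence>
					<xsd:element name="Issue" type="xsd:string"/>
				</xsd:sequence>
			</xsd:extension>
		</xsd:complexContent>
	</xsd:complexType>

	<!-- xsi:type style: one tag name, specialization named in the instance -->
	<xsd:element name="Publication" type="PublicationType"/>

	<!-- Substitution group style: specialization named by the tag itself -->
	<xsd:element name="Book" type="BookType" substitutionGroup="Publication"/>
	<xsd:element name="Magazine" type="MagazineType" substitutionGroup="Publication"/>

	<!-- The container only ever refers to the head element -->
	<xsd:element name="Catalogue">
		<xsd:complexType>
			<xsd:sequence>
				<xsd:element ref="Publication" minOccurs="0" maxOccurs="unbounded"/>
			</xsd:sequence>
		</xsd:complexType>
	</xsd:element>

</xsd:schema>
```

With these declarations, <Publication xsi:type="BookType"> and <Book> carry the same content and are both accepted inside <Catalogue>; only the place where the specialization is named differs.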

There was a message I tried to post a few days ago that appears to have been lost in the ether.

Basically, substitution groups and type inheritance feel like single inheritance.  But there is a wide variety of systems where structure and function do not coincide, where two
elements that don't share any common structure (other than ultimately deriving from the urType) should nevertheless be interchangeable.  In the OOP world, this legitimate need gave rise to multiple inheritance and
the concept of interfaces.

If you depend on some very primitive ancestor class/type (such as the urType) to resolve these cases, then you have the equivalent of

class Catalogue {
	// Object plays the role of the urType in this analogy
	void addPublication(Object publication) { }
}

where you really can't enforce any decent typing on the content.  In this case, you would have been in almost the same situation if you had used an <any> particle.

So with schema as it stands now, you can:

a) use substitution groups to expand sets that have some commonality and where elements do not participate in multiple sets.
b) use xsi:type to expand sets that have some common ancestor; however, the closer that common ancestor comes to being the urType, the closer it comes to being equivalent to the <any> particle.
c) use <choice> groups to designate a set of elements without any significant commonality as interchangeable.  Unfortunately, this has to be defined in the schema of usage instead
of the schema of subtype declaration.
d) use redefine to change a construct from another schema, but without giving the user of that schema any hint (or much control) that such a change is possible.

These have really close parallels to the concepts of single inheritance (a and b), enumerations (c) and creating derivative source code using cut and paste (d).

I'm really troubled by redefine, in that unlike parameter entities in a DTD, it does not give someone looking at the base schema any indication that a modification is to be expected.  Also, I am not sure
whether it could handle the case where two different modules each had their own modifications to the same base element or type.
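
To make the concern concrete, here is a rough sketch of a redefining schema.  The file name base.xsd, the PublicationType type and the Reviewer element are all hypothetical, and the namespace URI is that of the final Recommendation.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

	<!-- base.xsd is assumed to define a complex type named PublicationType -->
	<xsd:redefine schemaLocation="base.xsd">

		<!-- Inside redefine, a type is derived from the original type of the same name -->
		<xsd:complexType name="PublicationType">
			<xsd:complexContent>
				<xsd:extension base="PublicationType">
					<xsd:sequence>
						<xsd:element name="Reviewer" type="xsd:string"/>
					</xsd:sequence>
				</xsd:extension>
			</xsd:complexContent>
		</xsd:complexType>

	</xsd:redefine>

</xsd:schema>
```

Nothing in base.xsd has to announce that this change is expected, which is the contrast with a parameter entity that a DTD author deliberately exposes.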

In modularization of XHTML, it seems that what you want to accomplish is close to c but you want to be able to do it at the schema of subtype declaration.  Something pretty close to interfaces.

About a year ago, I posted a message on www-xml-schema-comments (http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000JanMar/0040.html) that proposed using open choice groups as an
alternative to what were then known as equivClasses.  While I would not suggest that substitution groups be replaced in XML Schema, it would seem that if the concept of open choice and attribute
groups were added as an additional modularization/extensibility feature, then the motivation for redefine would be reduced and an equivalent of the traditional use of parameter entities in element
content models and attribute lists could be mimicked.

Basically, an attribute "open" with a default value of "false" could be supplied on named <group> and <attributeGroup> declarations.  A <group> with open="true" would be required to contain a single <choice> or
<all> child (it can't be <sequence>, since the order would be indeterminate).  A value of "true" for "open" indicates that the choice particle, all particle or attributeGroup may be
augmented by elements or attributes that nominate themselves for membership.

A global element or attribute definition would nominate itself for membership in a group by adding the QName of the group to its "groups" attribute.  The minOccurs and maxOccurs (or use and value)
attributes from the definition would be asserted in all the groups.  But I think you would typically use minOccurs="1" and maxOccurs="1" or use="optional", so having the same fixed value in all groups shouldn't be
a huge concern.

For example,

<xsd:element name="Magazine" groups="Publications GlossyThings">
	<xsd:complexType>...</xsd:complexType>
</xsd:element>

<xsd:element name="Book" groups="Publications">
	<xsd:complexType>...</xsd:complexType>
</xsd:element>

<xsd:element name="Cat" groups="Animals GlossyThings">
	<xsd:complexType>...</xsd:complexType>
</xsd:element>

<xsd:attribute name="glossFactor" groups="GlossyThingsAttributes"/>

<xsd:attributeGroup name="GlossyThingsAttributes" open="true"/>

<xsd:group name="Animals" open="true">
	<xsd:choice>
		<xsd:element ref="Man"/>
	</xsd:choice>
</xsd:group>

<xsd:group name="Publications" open="true">
	<xsd:choice/>
</xsd:group>

<xsd:group name="GlossyThings" open="true">
	<xsd:choice/>
</xsd:group>

The group definitions would be equivalent to:

<xsd:group name="Animals" open="true">
	<xsd:choice>
		<xsd:element ref="Man"/>
		<xsd:element ref="Cat"/>
	</xsd:choice>
</xsd:group>

<xsd:group name="Publications" open="true">
	<xsd:choice>
		<xsd:element ref="Magazine"/>
		<xsd:element ref="Book"/>
	</xsd:choice>
</xsd:group>

<xsd:group name="GlossyThings" open="true">
	<xsd:choice>
		<xsd:element ref="Magazine"/>
		<xsd:element ref="Cat"/>
	</xsd:choice>
</xsd:group>

But the open form has the advantage that membership in the group is defined at the point of element declaration, rather than possibly in a base schema.
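
The example does not spell out the attribute side.  Reading the proposal the same way, and assuming the attribute group is declared open, it would presumably be equivalent to the following fragment ("open" and "groups" are the proposed attributes, not existing schema syntax):

```xml
<xsd:attributeGroup name="GlossyThingsAttributes" open="true">
	<!-- glossFactor nominated itself through its groups attribute -->
	<xsd:attribute ref="glossFactor"/>
</xsd:attributeGroup>
```

A complex type that references GlossyThingsAttributes would then pick up glossFactor, along with any attribute that another module later nominates for the same group.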

I think this would accomplish most of the traditional uses of parameter entities in content models and attribute lists that cannot be done well with the existing substitution groups.  It does so without
adding any new schema components, only an extension to the XML representation.  It allows the base schema designer to control the extension points in the schema in a way that <redefine> does not, and it allows
multiple modules to augment the same type or element in a way that <redefine> does not.

Alternatively, multiple groups could be allowed in the substitutionGroup attribute, and as long as the schema designer based the exemplar on the urType, you would have about the same behavior.  But you
would still want something like the open attribute group to support parameter entity or internal subset entries for the attribute list.
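
A sketch of that alternative, reusing the names from the example above.  The space-separated list of heads in substitutionGroup is the hypothetical extension being suggested, not something the schema language discussed in this message accepts.  The singular head names (Publication, GlossyThing, Animal) and the empty complex types are placeholders.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

	<!-- Heads with no type attribute default to the ur-type, so they constrain nothing -->
	<xsd:element name="Publication" abstract="true"/>
	<xsd:element name="GlossyThing" abstract="true"/>
	<xsd:element name="Animal" abstract="true"/>

	<!-- Hypothetical: a member may list more than one head -->
	<xsd:element name="Magazine" substitutionGroup="Publication GlossyThing">
		<xsd:complexType/>
	</xsd:element>

	<xsd:element name="Book" substitutionGroup="Publication">
		<xsd:complexType/>
	</xsd:element>

	<xsd:element name="Cat" substitutionGroup="Animal GlossyThing">
		<xsd:complexType/>
	</xsd:element>

	<xsd:element name="Man" substitutionGroup="Animal">
		<xsd:complexType/>
	</xsd:element>

</xsd:schema>
```

A content model would then refer to the head, for example <xsd:element ref="GlossyThing"/>, wherever the open form would refer to the GlossyThings group.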

Received on Monday, 8 January 2001 17:53:01 UTC