- From: David Orchard <dorchard@bea.com>
- Date: Fri, 27 Jun 2003 17:02:57 -0700
- To: <www-tag@w3.org>
Here's a rough start to the extensibility and versioning section of the web arch document, and a small change proposed to 3.2.

3.2 The format specification should be designed for extensibility and versioning.

3.2.2.4 Extensibility

XML and XML Namespaces are designed for creating vocabularies and combining them. Extensibility is the term for combining multiple vocabularies, or allowing more than one vocabulary to be in scope in a document.

Good practice: Languages should provide for extensibility.

Now what is the relationship between versioning and extensibility? A clear relationship is where a schema may be extended to add, change, or delete element and attribute definitions. We call this schema, and instances of it, a new version of the language. But what if a third party adds its vocabulary elements without changing the containing schema? Then the containing language has not evolved, but the document instance has. This is a new version of the message. An example is a SOAP message with a header block. We typically call the header block a SOAP extension and not a new version of SOAP. Any changes to the particular message would be considered a new version of the message.

Versioning is the term for the evolution of languages and documents. Versioning is achieved through extensibility mechanisms and language redefinition.

There are three types of version changes that can occur: incompatible, backwards compatible, and forwards compatible. In the case of XML documents, backwards compatibility means that a new version of an XML document can be deployed in such a manner as to not break existing agents that process the XML document. This means that a sending agent can send an old version of an XML document to a receiving agent that understands the new version and still have the message successfully processed. Forwards compatibility means that an older version of a receiving agent can receive newer documents and not break.
This means that a sending agent can send a newer version of a document and still have the message successfully processed. Backwards compatibility means that existing sending agents can use receiving agents that have been updated, and forwards compatibility means that newer sending agents can continue to use existing receiving agents. Forwards and backwards compatible changes are typically the addition of an optional element or attribute. The cost of non-backward or non-forward compatible changes is often very high, as all the software that uses the language must be updated to the newer version.

Good practice: Languages should be created with an extensibility model that permits forwards compatible and backwards compatible changes in the language.

Forwards compatibility means that a receiver must be able to receive newer content and process the message as if the newer content didn't exist. This newer content is considered optional, and acting as if it didn't exist is called "ignoring". Language designers need to indicate that optional content that receivers are not familiar with must be ignored. The mechanism for ignoring can have a few different flavours. One flavour is to simply act as if the content doesn't exist, though care must be taken with position-based behaviour. Another is to replace the element tag with the element's content.

Good practice: Languages should specify behaviour for unknown or unrecognized content. A common model is that such content must be ignored.

In cases where the newer content is required to be understood, or is mandatory, the language designer may need to provide a mechanism for indicating that the content must be understood. New, mandatory content is not a forwards compatible change. One technique for indicating that new content is required is to change the element names or the namespace names in the message. However, many languages are containers and are designed for extensions.
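
As a rough illustration of the two ignoring flavours described above, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It assumes element-only content (no mixed text), and the namespaces, element names, and function names are all made up for illustration; none of them come from a specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace that this (older) receiver understands.
KNOWN_NS = "http://example.org/order/v1"


def is_known(elem):
    """True if the element is in the namespace this receiver knows."""
    return elem.tag.startswith("{" + KNOWN_NS + "}")


def ignore_by_dropping(parent):
    """Flavour 1: act as if unknown elements (and their subtrees) don't exist."""
    for child in list(parent):  # iterate over a copy while removing
        if is_known(child):
            ignore_by_dropping(child)
        else:
            parent.remove(child)


def ignore_by_unwrapping(parent):
    """Flavour 2: replace an unknown element's tag with the element's content."""
    index = 0
    while index < len(parent):
        child = parent[index]
        if is_known(child):
            ignore_by_unwrapping(child)
            index += 1
        else:
            # Promote the unknown element's children into its position.
            # Text directly inside the unknown element is discarded here.
            parent.remove(child)
            for offset, grandchild in enumerate(list(child)):
                parent.insert(index + offset, grandchild)
            # Do not advance: a promoted child may itself be unknown.


def item_texts(root):
    """Collect the text of every known 'item' element, in document order."""
    return [e.text for e in root.iter("{" + KNOWN_NS + "}item")]


# A newer document: a hypothetical extension wraps one known item.
DOC = (
    '<o:order xmlns:o="http://example.org/order/v1"'
    ' xmlns:x="http://example.org/ext">'
    "<o:item>widget</o:item>"
    "<x:giftwrap><o:item>ribbon</o:item></x:giftwrap>"
    "</o:order>"
)

dropped = ET.fromstring(DOC)
ignore_by_dropping(dropped)

unwrapped = ET.fromstring(DOC)
ignore_by_unwrapping(unwrapped)

print(item_texts(dropped))    # ['widget']
print(item_texts(unwrapped))  # ['widget', 'ribbon']
```

The two flavours give different results for the same message, which is why a language needs to say which one applies. Either flavour also shifts the positions of the remaining siblings, which is where the care about position-based behaviour comes in.
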
Good practice: Languages that need to indicate mandatory extensions should provide such a facility. An example of this is the mustUnderstand attribute in SOAP.

XML DTDs and W3C XML Schema require that schemas have deterministic content models. An explanation from the XML 1.0 specification: "For example, the content model ((b, c) | (b, d)) is non-deterministic, because given an initial b the XML processor cannot know which b in the model is being matched without looking ahead to see which element follows the b." Schema languages like W3C XML Schema provide a variety of extensibility mechanisms, such as wildcards and type derivation. The combination of extensibility and determinism can make it difficult to create the optimal schema. As a simple example, in XML Schema, a wildcard that allows extension in any namespace (<xs:any namespace="##any"/>) cannot occur after an element whose minOccurs value is not equal to its maxOccurs value. If the min and max differ, the processor won't know whether an instance of the element belongs to the element definition or the wildcard. Another example is that a type definition that ends in a wildcard allowing any namespace cannot be extended through derivation, because the added elements would also match the wildcard.

Principle: Languages must account for determinism in the types of extensibility.

Cheers,
Dave
Received on Friday, 27 June 2003 20:03:51 UTC