- From: <noah_mendelsohn@us.ibm.com>
- Date: Tue, 21 Oct 2003 11:10:27 -0400
- To: Dean Hiller <dhiller@avaya.com>
- Cc: Dare Obasanjo <dareo@microsoft.com>, xmlschema-dev@w3.org
Dean Hiller writes:

>> If XSD was object-oriented, it would only care about the superclasses unless Company C had companyB's xsd so they could validate the new feature too.

>> I think XSD would be so much easier to go from version to version of all the different protocols out there if XSD was more object oriented.

I am the first to agree that versioning of vocabularies is crucial to the success of XML, and I have for three years been outspoken in my disappointment that the W3C community hasn't done a better job of tackling that issue. I think that (a) doing this right is a very, very hard problem; (b) there is quite a range of important use cases, ranging from handling of small bug fixes, such as allowing a new attribute or new attribute value, to rather major changes to parts of a vocabulary; and (c) there is some reason to suspect that in an ideal world, the solutions would involve not just schema, but perhaps changes to the way namespaces work, and perhaps new models of processing XML.

That said, I wouldn't leap too quickly to the assumption that "if only schema had exploited the well known principles of object orientation, we'd all be fine by now." First of all, months of study were given to the use of object orientation in schemas, and when you look carefully, there really are a number of reasons that people use object orientation, and interop across versions is just one. For example, object orientation is commonly used to allow for reuse and evolution at the source level. I believe that XML schema succeeds at this to a degree, both in the type system and with substitution groups.

The more fundamental point I wanted to make is that I think your statement misses a key point about object orientation. Indeed, it probably took a year of work on XML and schema for me to learn or at least notice this: in fundamental ways, XML (not just schema) and object orientation are at odds.
The fundamental premise of object orientation in programming is that state is hidden behind behavior. You don't expose data, you expose methods, and typically you expose those methods by name. In certain respects, XML is a reaction to object orientation: when people tried to loosely couple the method-oriented object structure of their systems using COM and CORBA, the resulting contracts were too detailed for some purposes. It turns out that when I want to order a book from Amazon, I don't want to know about the object structure (if any) of their implementation, I just want to send them a document representing a book order. Indeed, many different operations can be performed on that document, including signing it, filing it, updating it with costs and expected shipping dates, etc., but unlike an object, a document carries no behavior.

Why does this relate to the versioning issue? Well, look at how object-oriented systems achieve the behavior to which you refer. They impose a level of indirection, often referred to as virtual method calling, between those looking for data and those providing it. Thus, subclasses can come and go, as long as the contract at the method level is preserved. If you have interfaces, then sometimes only the pertinent subset of the contract needs to be preserved as the system evolves (i.e. you need to support the interface of interest).

XML is completely different. I can write all the schemas I want to check your XML document, but when it's handed to your application the data is right there for you to see. It's not just the schema language that needs to adapt to versions, it's your application.

Consider a simple example where someone extends a phone-number element to include a country code attribute. With some work one can write a schema for the original application that will accept the new, previously unknown country code attribute when it shows up. The question is, what does the app do with it?
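
To make the phone-number example concrete, here is a minimal Python sketch. The element name `phone`, the attribute name `countryCode`, the sample numbers, and both function names are hypothetical, chosen only for illustration; the 011 prefix is the US convention for international calls. It shows an original application that accepts the extended document but silently ignores the new attribute, next to an updated application that acts on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: names and numbers are illustrative only.
OLD_DOC = "<phone>6175550100</phone>"
NEW_DOC = '<phone countryCode="44">2079460000</phone>'


def digits_v1(xml_text):
    """Original app: reads the element text and never looks at attributes."""
    elem = ET.fromstring(xml_text)
    return elem.text.strip()


def digits_v2(xml_text, home_country="1", intl_prefix="011"):
    """Updated app: uses the country code when one is present."""
    elem = ET.fromstring(xml_text)
    number = elem.text.strip()
    # A missing attribute is treated as a domestic number (an assumption).
    country = elem.get("countryCode", home_country)
    if country == home_country:
        return number
    return intl_prefix + country + number


if __name__ == "__main__":
    print(digits_v1(NEW_DOC))  # parses without error, attribute ignored
    print(digits_v2(NEW_DOC))  # international prefix and country code added
```

Nothing in the parsing step stops the original application from reading the new document; the difference only shows up in what each application does with the data it sees.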
The schema may ignore it, but a dialing application had better not. Indeed, in the US, it would need to know to prefix it with an 011 if the code is other than country code 1, and dial that country code. In an object-oriented programming system, you'd have a new dial method that would override the old one, and would preserve the interface.

Now you can see one of the reasons we didn't do multiple inheritance. With single inheritance we can at least have the (somewhat messy) convention that extensions are at the end of the structure. What do you do with multiple inheritance? How do you help an application keep track of where one base class is contributing content and then another? I'm not saying it's impossible, but I am saying the issues are quite different from what you find in an OO system, where behavior wraps data.

Finally, I believe this is one of the reasons that refinement and extension are called out separately in schema, something which is less common in OO systems. Java doesn't particularly distinguish a subclass which just provides a subset of the data of its base (e.g. restricts integers to values < 65535) from subclasses that provide extended behavior; both are declared in the same way. In a data system like schema, it's much more helpful to call out "this subclass is really a subset of the base", because that's the case where you know that an old application can handle the new data.

So, I agree versioning is important. I'm less convinced that the answer for XML is primarily to be found in the world of object orientation. XML is data without behavior, somewhat the opposite of OO.

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------
Received on Tuesday, 21 October 2003 11:11:10 UTC