- From: Steven R. Newcomb <srn@coolheads.com>
- Date: Wed, 25 Oct 2000 13:06:16 -0500
- To: clbullar@ingr.com
- CC: elharo@metalab.unc.edu, xml-dev@xml.org, www-xml-schema-comments@w3.org
[Len Bullard:]
> [a bunch of interesting words about layered systems
> that I didn't really understand]
> ...Perhaps you could detail the concept of
> ready to run information.

Consider the interchange form (an XML message) of a purchase order. It would be a bad idea to include a total amount to be paid, since an explicit total would be redundant. If somebody tweaked the interchange form, the total would be inconsistent with the rest of the message, and there would probably be no easy way for the recipient to determine which information is invalid.

In general, then, it's a bad idea to include redundant information in interchange messages, not just because it uses bandwidth, but more importantly because it is very likely to cause ambiguity.

Consider the form of the purchase order when it is "ready to run" -- when an API is provided to the information it contains. It's very reasonable to provide such an API with a "total()" method. Redundancy in APIs is good; APIs are supposed to be convenient to use. total() gives access to an "emergent property" (as opposed to an explicit syntactic property) of the information set found in purchase orders. Of course, while total() makes sense for purchase orders, it doesn't apply to many other kinds of XML messages.

The grove paradigm fully recognizes that information can have multiple levels of interpretation applied to it, and that such interpretations have the effect of making implicit information explicit.

* When an XML document is processed, a grove of the syntax is the result (we usually call this grove a "DOM tree"). In grove land, each node is an addressable information component. The tree structure that was implicit in the interchange form of the information has become explicit.

* But wait, there's more.
Vocabularies are used in XML documents, and, depending on the semantics of those vocabularies, there can be properties that "emerge" from the information, when the information is understood in terms of the intended semantics of the vocabularies. In grove-land, these emergent properties appear in additional groves, and those properties, too, are reliably addressable. Thus, the "total" property of a purchase order can become explicit and addressable, even though it was only implicit in the interchange form of the information.

The purchase order example is a trivial one that's good for teaching purposes, but it's not very compelling, I think. I find the example of topic map processing much more compelling.

The syntactic components of a topic map document are not, and they can never be, fully indicative of their own significance. They can only be fully understood in terms of their connections to many other things whose syntactic whereabouts are necessarily arbitrary. The *whole* topic map document must be understood -- processed -- before the significance of any of it can be fully and reliably understood.

Topic map processing causes topic map documents to become things that resemble ready-to-run semantic nets. (Groves are one way to think about these nets -- a way that has the advantage of offering reliable addressability based on international standards -- but the truth is that groves are just one way.)

The reason you create topic map documents is to allow these semantic net-like things to be interchanged and merged with one another by their end users and by people who wish to add more value to them in various ways. The nets don't and can't resemble the interchange documents, because of their own very highly interconnected and interdependent nature, and because of the fact that the nature of an interchangeable document is quite different from that of a semantic net. An interchangeable document is nothing more or less than a sequence of characters.

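To make the purchase order case concrete, here is a minimal sketch in Python. Everything named in it is an assumption made up for illustration: the purchaseOrder and item elements, the sku, quantity and unitPrice attributes, and the PurchaseOrder class are hypothetical, not taken from any real vocabulary or grove implementation. The interchange form is just a string with no explicit total; the ready-to-run form parses it into a tree and offers total() as an emergent property.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical interchange form: a sequence of characters, with no
# redundant total. Element and attribute names are invented.
MESSAGE = """\
<purchaseOrder>
  <item sku="A-100" quantity="2" unitPrice="9.50"/>
  <item sku="B-200" quantity="1" unitPrice="20.00"/>
</purchaseOrder>
"""


class PurchaseOrder:
    """Hypothetical ready-to-run form: an API over the parsed message."""

    def __init__(self, text: str) -> None:
        # First level of interpretation: the tree structure that was
        # implicit in the character sequence becomes explicit.
        self._root = ET.fromstring(text)

    def items(self):
        """Yield (sku, quantity, unit price) for each line item."""
        for item in self._root.iter("item"):
            yield (
                item.get("sku"),
                int(item.get("quantity")),
                Decimal(item.get("unitPrice")),
            )

    def total(self) -> Decimal:
        # Second level: a property that "emerges" from the vocabulary's
        # semantics. It is computed on demand, never stored in the message,
        # so it cannot disagree with the line items.
        return sum((qty * price for _, qty, price in self.items()), Decimal("0"))


order = PurchaseOrder(MESSAGE)
print(order.total())  # prints 39.00
```

If somebody tweaks a quantity in the message, total() simply reflects the change, whereas a total carried in the interchange form would have to be re-checked against the items by every recipient.
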
So, I repeat what I said in my earlier note: There is this common wisdom out there that the structure of interchanged information should also be, in effect, the API to that same information. But, in fact, it's only true for a simple subset of the kinds of information that need to be interchanged, and to which APIs must be provided.

Len, does this speak to what you were saying about layered systems?

-Steve

--
Steven R. Newcomb, Consultant
srn@coolheads.com

voice: +1 972 359 8160
fax:   +1 972 359 0270

405 Flagler Court
Allen, Texas 75013-2821 USA

"We're not exactly anti-schema, but we're sure pro-DTD." -- doctypes.org

Received on Wednesday, 25 October 2000 14:07:26 UTC