- From: Roger L. Costello <costello@mitre.org>
- Date: Mon, 30 Dec 2002 09:35:33 -0500
- To: "Xmlschema-Dev (E-mail)" <xmlschema-dev@w3.org>
- CC: "Costello,Roger L." <costello@mitre.org>
Hi Mark,

Mark Feblowitz wrote:
>
> Of course, such an approach would require innovations in parsing
> technologies, since the loading and processing of what could be
> hundreds of schemas for a reasonably sized xml document would be
> prohibitive. There are a few standards out there that essentially have
> one schema file per chunk, and they are notoriously slow to be
> validated. Extra machinery such as a schema repository or pre-assembly
> of the full collection of chunk schemas would be required.

You make an excellent point, Mark. If we were to use the schema chunk
idea - with one schema file per chunk - with today's style of creating
large instance documents, then validation would be prohibitively slow
and expensive.

However, that assumes that creating large instance documents is a good
thing. I will argue that it is not.

One design approach is to exchange (between sender and client) a few
documents, each document containing a lot of data. That is, send large
instance documents.

Advantages:
   - may make efficient use of bandwidth

Disadvantages:
   - Oftentimes a client doesn't need all the data, just a portion
     of it.

An alternative design approach is to exchange a lot of documents, each
document containing a little data.

Advantages:
   - The client can be sent just the data he/she desires

Disadvantages:
   - may make less efficient use of bandwidth

I will argue that it is typically better to lean towards the latter
design approach - exchange small instance documents. Note that this is
also consistent with the XML Streaming approach.

So, not only do I advocate the creation and use of "schema chunks", I
also advocate small instance documents.

> Another down side of this approach is the management of similar,
> derived concepts. For concept A' to be derived from concept A, either
> the schema for A' must be dependent on the schema for A, or the
> information content from A must be replicated in A', and we all know
> how difficult it is to maintain definitions that result from
> replication (especially those who've struggled with derivation by
> restriction on any reasonable scale).

Yes, you are absolutely correct. With the schema chunk approach you may
end up repeating things in multiple chunks. It boils down to this
tradeoff: independent components versus reusable type hierarchies.

My experience is that schema type hierarchies make schemas overly
complex and brittle. I cannot tell you how many schemas I have seen
with type hierarchies 7 levels (or more) deep. These schemas are
virtually impossible to understand by anyone other than the original
schema designers.

Independent components, on the other hand, each have a specific use and
specific semantics. They are easy to understand, and they can be
plugged in to a lot of different uses.

From my perspective, simplicity and "pluggability" are of most
importance. I am willing to sacrifice the slight benefits of type reuse
to gain the benefits of using rock-solid components.

Thanks for your comments Mark!

/Roger
Received on Monday, 30 December 2002 09:35:52 UTC