- From: Mark Feblowitz <mfeblowitz@frictionless.com>
- Date: Wed, 17 Apr 2002 10:53:19 -0400
- To: "'Jeni Tennison'" <jeni@jenitennison.com>
- Cc: Paul Kiel <paul@hr-xml.org>, xmlschema-dev@w3.org, "Alexander Falk (E-mail)" <al@altova.com>
Yup - you got the gist. In some regards, the order of the content isn't even a firm requirement - just the existence of the content. The fact that order comes into play could be based on a hard ordering requirement, but in most cases it's a side effect of the limitations on the use of "all".

On the subject of separating the concerns (cardinality from structure/content), there's another aspect: whether the cardinality constraints are kept with the "structural" schema or whether they are separate (in separate files). The example, (E1, E2, E3, E4) & E2? & E3+, implies that they are kept together. It would be better for the base structural schema, represented by (E1, E2, E3, E4), to be kept separate from the cardinality-constraining schema, E2? & E3+, e.g., in a separate xsd file. In this way, one could have a single structural base and then constrain it in many different ways (one way per differing context, e.g., one way per differing transaction). Another benefit of separating out the cardinality constraints is that they can be modified, or new ones added, without having to rev the structural base - an important factor in maintaining a schema over time.

Schematron is indeed well suited for this task, since the rules can be maintained separately, and there can be multiple sets, one for each different blend of cardinality constraints. The fact that Schematron uses XPath expressions is the real strength here; that's what enables the constraint to be overlaid on the correct part of the structural schema.

The part that most troubles many end users about using Schematron (aside from its sci-fi name) is that the constraints are applied by a separate utility - not the parser. Even if there were no significant performance penalty to running Schematron (i.e., even if it were run in the same JVM), there is still a perception problem on the part of many users, who worry about the extra logistics of running a separate pass and about the perceived added complexity.

Since Schema validators must have XPath expression evaluation capabilities in order to process key and unique expressions, it would seem like a small step to also support the evaluation of other XPath expressions, e.g., those used to express overlaid cardinality constraints or those used to apply co-occurrence constraints. In this way, the evaluation of the constraint expressions could be done "by the parser", at least as perceived by most Schema users. In one possible scenario the base structural schema (Person) plus the cardinality constraints (personTransaction1Constraints) could be brought in via includes, and the parser could check both "at the same time" (likely in separate passes). All the user would need to know is how to express the constraints; they wouldn't need to develop an architecture for applying a separate checking step.

BTW, I know that the folks on the Schema Working Group can't comment on what they're considering for 1.1. I just want to make sure that these discussions are being read by and considered by at least some of the members. Is this list the correct forum, or is there another one that I should also be CCing?
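To make the separate-constraints idea above concrete: a minimal sketch of what one such separately maintained rule set could look like in Schematron, assuming hypothetical element names (Person, MiddleName, EmailAddress) and a hypothetical constraint file per transaction context.

    <!-- personTransaction1Constraints.sch (hypothetical): cardinality only;
         the shared structural schema is left untouched -->
    <sch:schema xmlns:sch="http://www.ascc.net/xml/schematron">
      <sch:pattern name="Transaction 1 cardinality">
        <!-- the XPath context is what overlays the constraints
             on the right part of the structure -->
        <sch:rule context="Person">
          <sch:assert test="count(MiddleName) &lt;= 1">
            MiddleName may appear at most once in this transaction.
          </sch:assert>
          <sch:assert test="count(EmailAddress) >= 1">
            At least one EmailAddress is required in this transaction.
          </sch:assert>
        </sch:rule>
      </sch:pattern>
    </sch:schema>

Swapping in a different transaction's constraints would then just be a matter of pointing the validation step at a different .sch file; the structural xsd itself never changes.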
On the non-rhetorical part, I'm still unclear on how a group G in namespace ns1 could be referenced in a type T in ns1, for T to be extended in namespace ns2, e.g., by extending ns1:G. A redefine of ns1:G to add new content (e.g., elements defined in ns2) could be the way to go, but it seems tricky: in order to extend ns1:G with content from ns2 and then use it in ns2, I redefine it in ns1 (replicating the original content?), import the additional content from ns2, and then make sure that ns2 imports the redefining schema and that ns2:T uses the redefinition. Hmmm.

Thanks,
Mark

--------------------------------------------------------------------------------
Mark Feblowitz                        [t] 617.715.7231
Frictionless Commerce Incorporated    [f] 617.495.0188
XML Architect                         [e] mfeblowitz@frictionless.com
400 Technology Square, 9th Floor
Cambridge, MA 02139
www.frictionless.com

-----Original Message-----
From: Jeni Tennison [mailto:jeni@jenitennison.com]
Sent: Wednesday, April 17, 2002 7:11 AM
To: Mark Feblowitz
Cc: Paul Kiel; xmlschema-dev@w3.org
Subject: Re: Schema Design: Composition vs Subclassing

Hi Mark,

> What we recognized is that each use of a particular Noun shares a
> common structure (same child elements, same order), and that they
> only differ in the cardinalities of their child elements. That's why
> we factored out the cardinalities from the definition of structure:
> we define the structure of a thing once, and we define the possible
> combinations of the cardinalities of its parts separately.

This is extremely interesting. You're making a distinction between:

- the order in which elements appear
- the number of times those elements appear

Traditionally, content models have combined these two factors. When a validator goes through an instance it checks the number of occurrences of an element at the same time as checking that they're appearing in the correct order. But you could imagine a content model that expressed the two factors independently. For example, given a DTD content model of:

  (E1, E2?, E3+, E4)

you could have one part that said that E1, E2, E3 and E4 must appear in order, followed by a second part that said that E2 was optional and E3 had to occur one or more times. Making up a syntax:

  (E1, E2, E3, E4) & E2? & E3+

A validator could feasibly go through an element's content twice, once checking order and once checking cardinality, or it could combine the two parts to create a traditional content model.

Separating out these two parts would enable you to vary one while keeping the other the same. So for example you could say that all Nouns shared the same order of elements and swap in different cardinality constraints as and when required.

As far as I know, the only schema technology that enables you to make that kind of division explicitly is Schematron -- you can have one set of rules that test ordering, and another set of rules that test cardinality. When RELAX NG had 'concur' you could have used that (essentially imposing two overlapping sets of constraints on the content model); you could still use it with TREX, I suppose. But this is essentially what you're doing -- using XML Schema to provide the ordering constraints (which means that you have to be very flexible about the cardinality, essentially not testing it or testing it within known limits) and another technology to provide concurrent validation of the cardinality constraints.

This is interesting in terms of schema language development because it implies that something like 'concur' would be a useful addition to the language. You could imagine it solving some of the problems that people have with "these elements, in any order, with these cardinalities" as well.
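As a sketch of the XML Schema half of that split (the wrapper element name "Container" is made up), the ordering alone could be pinned down by leaving every cardinality wide open:

    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="Container">
        <xs:complexType>
          <xs:sequence>
            <!-- order is fixed; cardinality is deliberately left open,
                 to be checked by a concurrent set of rules -->
            <xs:element name="E1" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="E2" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="E3" minOccurs="0" maxOccurs="unbounded"/>
            <xs:element name="E4" minOccurs="0" maxOccurs="unbounded"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>

The E2? and E3+ parts would then be expressed separately, e.g., as Schematron tests along the lines of count(E2) &lt;= 1 and count(E3) >= 1, giving the two-pass arrangement described above.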
It's also interesting in terms of the relationship with OO technologies. In OO technologies, ordering isn't an issue, only cardinality, so the normal "best practice" for OO probably isn't going to help here. Design patterns for OO technologies simply don't have to deal with this kind of issue.

Anyway, after all that, to answer the non-rhetorical question in your post:

> (which reminds me, are groups extensible? How does one do so?)

They're only extensible (while retaining the same name) when you redefine the schema. When you use xs:redefine, you can change model groups as long as either:

- you reference the model group within the new model group (essentially allowing you to extend it) or
- the redefined model group is a valid restriction of the original model group (essentially allowing you to restrict it)

Of course you can reference model groups with different names wherever you like within a model group -- that's equivalent to extending complex types within a single schema.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/
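A minimal sketch of the first (extending) case, with made-up names and no target namespaces: base.xsd declares a group, and a second schema redefines it, referencing the group inside its own redefinition to pull in the original content.

    <!-- base.xsd (hypothetical) -->
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:group name="PersonContent">
        <xs:sequence>
          <xs:element name="Name" type="xs:string"/>
          <xs:element name="Address" type="xs:string"/>
        </xs:sequence>
      </xs:group>
    </xs:schema>

    <!-- extended.xsd (hypothetical): same group name, extended content -->
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:redefine schemaLocation="base.xsd">
        <xs:group name="PersonContent">
          <xs:sequence>
            <!-- the self-reference brings in the original group's content -->
            <xs:group ref="PersonContent"/>
            <xs:element name="EmailAddress" type="xs:string"/>
          </xs:sequence>
        </xs:group>
      </xs:redefine>
    </xs:schema>

This only illustrates the same-namespace mechanics, not the cross-namespace case asked about earlier in the thread.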
Received on Wednesday, 17 April 2002 10:54:36 UTC