- From: Rick Jelliffe <ricko@gate.sinica.edu.tw>
- Date: Fri, 8 Oct 1999 22:03:14 +0800
- To: <www-xml-schema-comments@w3.org>
Variation is a fact of life for languages (e.g., structural schemas) as much as anything else. Taking HTML as a primary example, we find three official variants defined at W3C, which in turn are abstractions of many variants implemented through different generations of browsers. Similarly, a document going through a workflow may have different levels of validation that are appropriate: either because the structure changes through the workflow or because the system designer does not want to know about particular validity errors at a certain stage. Likewise, it is clear that a markup language evolves over time. This needs to be made manageable and simple.

The current WD does not have UP-TO-DATE sections that address this issue; the architectural refinement section is the closest. It does not have a concept of subset variants. I can easily imagine that other efforts (for example, based on schema composition) may also miss this essential characteristic unless it is explicitly addressed.

Subset variants can readily be handled using the following mechanism:

 * every element, content model, archetype reference, element reference and attribute group reference, etc. should take an extra attribute "variant" which contains a list of names, e.g.

     <element name="blink" type="inline" variant="html-slack" />

 * this parameter can be used in two ways:

     - a document instance can be validated against the schema in the usual way; the validator can provide a list of which variants were found during the parse.

     - a document instance can be validated against the schema allowing only a provided set of variants; the validator reports "schema-valid" or "schema-invalid" only. This provides much stronger typing.

A variant attribute would also be useful for creating editing tools during a workflow, to accommodate a division of labour: the operator might deem metadata to be a "variant" and then validate the document against everything except the metadata.
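To make the two modes of use concrete, here is a minimal sketch in Python. It is an illustration only, not part of the WD or of any existing validator: the <schema> wrapper, the element names other than "blink", and the function names are all hypothetical, and the sketch checks only which elements occur, not content models. It also assumes that a declaration listing several variant names is permitted when at least one of them is allowed; the proposal above does not settle that point.

```python
import xml.etree.ElementTree as ET

# Toy schema in the notation used above. The <schema> wrapper and every
# element name other than "blink" are made up for this illustration.
SCHEMA = """
<schema>
  <element name="p" type="block" />
  <element name="em" type="inline" />
  <element name="blink" type="inline" variant="html-slack" />
  <element name="meta" type="metadata" variant="metadata html-slack" />
</schema>
"""


def load_variants(schema_text):
    """Map each declared element name to its set of variant names."""
    root = ET.fromstring(schema_text)
    return {
        decl.get("name"): set(decl.get("variant", "").split())
        for decl in root.iter("element")
    }


def variants_found(declared, instance_text):
    """Mode 1: validate in the usual way and report the variants seen.

    Returns None if the instance uses an undeclared element.
    """
    found = set()
    for node in ET.fromstring(instance_text).iter():
        if node.tag not in declared:
            return None
        found |= declared[node.tag]
    return found


def valid_for(declared, instance_text, allowed):
    """Mode 2: report only valid/invalid against an allowed variant set.

    Assumption: a declaration with several variant names is permitted
    when at least one of those names is in the allowed set.
    """
    for node in ET.fromstring(instance_text).iter():
        variants = declared.get(node.tag)
        if variants is None:
            return False  # element not declared at all
        if variants and not (variants & allowed):
            return False  # element belongs only to disallowed variants
    return True
```

For example, with declared = load_variants(SCHEMA), calling variants_found(declared, "<p>Hello <blink>world</blink></p>") reports which variants the instance touches, while valid_for(declared, ..., set()) checks the same instance against the core schema with no variants allowed.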
A variant attribute therefore only makes sense on the roots of non-required structures.

The advantages of this approach are, I believe:

 * convenient and obvious to compute
 * intuitive for schema-writers
 * avoids the problems of multiple-inheritance
 * does not require tracing through a chain of notionally (and perhaps physically) separate schemas to resolve
 * allows the specification of reduced content models; it seems that the issue of how to extend existing schemas has been taking the WG's time rather than the issue of how to subset the schema;
 * allows a convenient description of HTML instead of 3 DTDs;
 * provides a mechanism for "modular HTML" as well.

I suspect that this approach may also simplify the issue of schema extension: a "composition" or "inheritance" or "refinement" system may more comfortably do its thing for superset or piecemeal schemas.

(I suppose the alternative to this approach is to use some kind of "exclusion" schema, in which a list of exclusions is associated with some model. This has all the disadvantages of being externally specified, verbose, unintuitive, and poor modeling.)

(Note: it may be that, to make this proposal workable, the variant names also need to be made first-class, with declarations and a URL. This is a different issue.)

Theoretically, a "variant" schema is a subtype of the main schema, but not declared using an inheritance mechanism. Furthermore, because more than one variation can be in operation at any time, a variation is perhaps better thought of as the reification of a module where that module may have effects throughout the schema.

It may be argued that the idea of "variants" is out of line with formal computer language notions.
I would note instead that grammar systems modeling real-world phenomena often need exactly this kind of facility: I note the presence of "guards" on transitions in UML statechart diagrams, and the notion of phases in states which is used in some engineering modeling (a phase represents persistent data between invocations of a state). Furthermore, as noted, the existence of variants as described above provides no algorithmic challenges to an implementer or theoretical challenges for schema composition.

I commend this to the Schema WG as an official comment on the current working draft.

Rick Jelliffe
Computing Centre
Academia Sinica (W3C Member)
Taipei, Taiwan
Received on Friday, 8 October 1999 10:07:15 UTC