- From: <Noah_Mendelsohn@lotus.com>
- Date: Tue, 26 Jun 2001 21:22:15 -0400
- To: Kohsuke KAWAGUCHI <kohsukekawaguchi@yahoo.com>
- Cc: ochipara@cse.unl.edu, xmlschema-dev@w3.org
Kohsuke KAWAGUCHI writes:

>> My following article tries to propose a subset of XML Schema:
>> http://www.geocities.com/kohsukekawaguchi/XMLSchemaDOsAndDONTs.html

While everyone will have different opinions as to which features are best skipped in a subset (I would put key/keyref high on the list), I think that many of your suggestions are reasonable starting points either for novices or perhaps for others wishing to use a more restricted language.

I do feel I should point out one aspect of your proposal that isn't quite right: you suggest that use of complex types be avoided. If you study the schema design carefully, you'll realize that this suggestion means that no elements can have attributes, and no elements can have other elements in their content. Surely that is not what you intended. The following explanation is adapted from a note I wrote earlier today on the same subject:

In general, you can think about every element as having a complex type, except in the special case where its content happens to be a simple type such as integer, with no attributes. Another way to think about it: we intended complex types as the types you use on elements, simple types as the ones you use on attributes. Start with that assumption and you will be thinking right about most of the design.

That said, since all the content that is legal on attributes, such as integers, is also legal on elements, we faced a choice. One way would have been to provide a complex analog for every simple type. Had we done that, then all elements would have complex type, and all attributes simple. On balance, we decided it would work better to just allow elements to have either simple or complex type. Still, there is a sense in which complex types are the types for elements, and the ability to use simple types on elements is just a convenience.

So, what you can do if you prefer is not to separately name your complex types; you can do them all anonymously as part of element declarations.
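
To make the anonymous option concrete, here is a minimal sketch of an element declaration whose complex type is written inline rather than named. The element and attribute names (BOX, LABEL, Color, COUNT) are hypothetical and chosen only for illustration; the sketch assumes the XML Schema namespace is the default namespace, so the schema elements appear unprefixed.

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical element with a child element and an attribute.
       Either feature alone requires a complex type; here the type
       is anonymous, declared inside the element declaration. -->
  <element name="BOX">
    <complexType>
      <sequence>
        <element name="LABEL" type="string"/>
      </sequence>
      <attribute name="Color" type="string"/>
    </complexType>
  </element>

  <!-- Hypothetical element with integer content and no attributes:
       the special case where a simple type is enough. -->
  <element name="COUNT" type="integer"/>

</schema>
```

Because the type of BOX has no name, no other element declaration can refer to it.
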
What you lose, if that's the simplification you intended, is the ability to model the commonality in data such as:

    <WIDTH Units="cm">20</WIDTH>
    <HEIGHT Units="cm">40</HEIGHT>

A plausible way to declare this in XML Schema is:

    <complexType name="measurementType">
      <simpleContent>
        <extension base="integer">
          <!-- following could be enumeration of cm, in, feet, etc. -->
          <attribute name="Units" type="string"/>
        </extension>
      </simpleContent>
    </complexType>

    <element name="WIDTH" type="measurementType"/>
    <element name="HEIGHT" type="measurementType"/>

So using a named complex type in this situation correctly captures what is common between the two types of element. Higher level programs may realize that the same Java class or C structure can be used to hold either a width or a height. In general, when you generate the obvious Java mappings from Schema, it's one class per complex (or simple) type, one member variable per element or attribute. If the shape is a square, you can safely copy the data from width to height.

You can validate the same data without a named complex type (i.e., just define width and height separately), but you have to keep the definitions in sync, and you don't have any formal way to indicate that the structures are indeed common. If you mapped to a language such as Java, you'd probably get two classes where one would have been more appropriate.

Whether or not you choose to use explicitly named complex types, such as the one in the example, you will definitely need at least anonymous complex types for many of your elements. I hope this helps to clarify the design and the terminology.

Regarding the need for explicit markup such as <simpleContent>: I am not that fond of it. Earlier versions of our design had less of that, but they suffered for having lots of optional attributes on various schema constructions. So, for example, there was a base= attribute on types regardless of whether or not a derivation was actually being done.
The more verbose markup was to eliminate such optionality, and to allow schemas themselves to be more rigorously validated and thus more easily manipulated. I do think it is a nuisance when one is writing schemas manually.

------------------------------------------------------------------------
Noah Mendelsohn                                    Voice: 1-617-693-4036
Lotus Development Corp.                            Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------------
Received on Tuesday, 26 June 2001 21:27:25 UTC