- From: <Erwin.Smout@ksz-bcss.fgov.be>
- Date: Wed, 16 Apr 2003 13:30:38 +0200
- To: xmlschema-dev@w3.org
Hello,

Recently, I raised an issue here at work regarding global and root elements in xml-schema. Our xml-specialist did not have an answer immediately, but later pointed me to a discussion about the subject: http://lists.w3.org/Archives/Public/xmlschema-dev/2001Jun/0074.html. I must say I didn't feel comfortable with some of the statements made there, and thought I might add my point of view on the subject.

Mr. Mendelsohn states that someone might want to be able to have two different elements as a root. I really don't see how this could be a necessity to anyone. The root-element itself enables you to name the schema that governs the xml-document. It is perfectly possible to refer to a BOOKLIST.XSD in a <BOOKLIST> root and to a BOOK.XSD in a <BOOK> root. With proper include-mechanisms in place, there is little extra effort involved in having these two different schemas, instead of only one that allows different root-element-types. So I can't really agree with him there.

And I can't agree at all with what is said about "partial validation". This goes against everything xsd stands for. I clearly recall having read the guidelines saying that "a parser should stop passing data from the moment it finds an error. Furthermore, programs receiving an error-message from a parser should consider all data they already parsed from the document as non-existent". This leads me to conclude that "valid xml" (according to xsd) is (meant to be) an all-or-nothing proposition. There is no such thing as "partially valid".

And the fact that some programmer might want to do something like partial validation is not a good reason to "accept" this line of thinking. Programmers have been interpreting standards and guidelines in this fashion ("I will use what is useful to me and ignore whatever I don't like") for as long as I can remember (unfortunately). They have always been, and will always remain, the main reason why so many efforts toward standardisation prove useless and simply fail.
Think about it for a moment. Two organisations (be it two companies, a company and the government, two departments within a company, or whatever) decide to exchange data about, let's say, "customers" in xml-format. They agree on a <customer> root-element which holds several subordinate elements: <custnr> (mandatory), followed by either a <legalperson> element or a <naturalperson> element. The <legalperson> contains <name> and <legalform> elements; the <naturalperson> contains <surname>, <firstname> and <initials> elements.

Now, in this example, if one side sent an xml-document with only a <firstname> element (and thus without the customer number), and <firstname> happened to be declared as a global element in the schema, then a validation process based on xsd would not mark this document as "invalid", even though elements which were clearly intended and declared to be mandatory (<custnr>, for example) aren't there at all? Come on guys, let's be serious for a moment.

It would seem obvious to me that:

a) a receiving party cannot do anything with just the <firstname> element; it will always need at least the customer number before it is able to perform whatever useful processing it could do with this message.

b) a receiving party would therefore expect its "validation process" to mark this "<firstname>-only" message as "invalid", because it lacks essential data. Rightfully so.

c) if the receiving party cannot rely on xsd to do just that, then what good is xsd to anybody anyway?

I think this little example shows clearly enough that there is indeed a need to be able to designate some element as being the root in xml-schema.

Now, how to achieve this? To do that, we need some information that enables us to distinguish between elements that are merely "global" and the element(s) that are actually present (or possibly present) in the xml described by the schema.
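To make the customer example concrete, here is a minimal sketch of how such a schema might be written when every element is declared globally. The file name customer.xsd, the absence of a target namespace, and the use of xs:string for all leaf elements are assumptions made purely for illustration; none of them comes from an actual agreed schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical customer.xsd: every element is a global declaration. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- The intended root: custnr is mandatory (minOccurs defaults to 1),
       followed by exactly one of legalperson / naturalperson. -->
  <xs:element name="customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="custnr"/>
        <xs:choice>
          <xs:element ref="legalperson"/>
          <xs:element ref="naturalperson"/>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="legalperson">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="legalform"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="naturalperson">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="surname"/>
        <xs:element ref="firstname"/>
        <xs:element ref="initials"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Leaf elements, also global; the xs:string types are assumed. -->
  <xs:element name="custnr" type="xs:string"/>
  <xs:element name="name" type="xs:string"/>
  <xs:element name="legalform" type="xs:string"/>
  <xs:element name="surname" type="xs:string"/>
  <xs:element name="firstname" type="xs:string"/>
  <xs:element name="initials" type="xs:string"/>

</xs:schema>
```

As far as I understand the rules, nothing in this schema singles out <customer> as the root. Any of the nine global declarations may serve as the document element, so a document consisting of nothing but a <firstname> element can be assessed as valid against it.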
In fact, these "global" elements apparently serve the purpose of "declaring" the structure of some type of element, not of declaring the (possible) presence of such an element in an xml-document. Apparently, xsd now has two distinct meanings for the <element>-element:

1) as a declaration of a certain type that can be referred to later in the schema.

2) as a declaration of the possible occurrence of such an element in an xml-document.

In my view, this is flat out WRONG. If two distinct sorts of information are needed (here the "type-declaration" and the "xml-element-declaration"), then they should have different names, or be recognisable as such in whatever way is appropriate. The xsd-syntax apparently does not allow this. There is no way to determine unambiguously what "meaning" has to be assigned to an <element> in a schema. I feel this is a major design error in the xsd syntax, which should be removed as soon as possible.

Designers do have a way to avoid this problem (by using <simpleType> and <complexType> for declarations, and using <element> for the actual xml-element description, assigning type-information by "type=typeref"), but this is no solution for someone writing a schema-validation process. The authors of schema validation processes cannot rely on every schema-author using this method.
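The designer-side workaround mentioned above could look roughly like the following sketch for the customer example. The type names (CustomerType, LegalPersonType, NaturalPersonType), the absence of a target namespace, and the xs:string leaf types are all hypothetical choices made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical schema: named types carry the structure,
     and only one element is declared globally. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Pure "type-declarations": these describe structure only. -->
  <xs:complexType name="LegalPersonType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="legalform" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="NaturalPersonType">
    <xs:sequence>
      <xs:element name="surname" type="xs:string"/>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="initials" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="CustomerType">
    <xs:sequence>
      <!-- Local element declarations: usable only inside a customer. -->
      <xs:element name="custnr" type="xs:string"/>
      <xs:choice>
        <xs:element name="legalperson" type="LegalPersonType"/>
        <xs:element name="naturalperson" type="NaturalPersonType"/>
      </xs:choice>
    </xs:sequence>
  </xs:complexType>

  <!-- The single global element declaration, i.e. the only possible root. -->
  <xs:element name="customer" type="CustomerType"/>

</xs:schema>
```

With this layout a "<firstname>-only" document has no matching global declaration, so a strict validator should reject it. The point stands, however, that this is a convention adopted by the schema author and not something the xsd syntax enforces.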
Received on Wednesday, 16 April 2003 08:09:42 UTC