- From: Mary Holstege <holstege@mathling.com>
- Date: Tue, 19 Jun 2001 16:59:47 -0700
- To: www-xml-xinclude-comments@w3.org, w3c-xml-core-wg@w3.org
XML Schema Working Group comments on XML Inclusions Last Call Working Draft

We believe that XInclude is a foundation specification that has to be harmonized carefully with the other foundation specifications. The following points outline our concerns with the specification as it stands:

(1) While the Infoset specification countenances synthetic infosets that do not maintain the normal consistency relations of infosets created directly by parsing XML, we consider it a poor idea in general to take advantage of that laxity, particularly in the case of such a foundational specification. We believe the XInclude specification must be crystal clear about which Infoset properties are adjusted, and how, and further that it should specify rules so that core invariants are maintained. Since a downstream application has no markers in an Infoset that the XInclude process has occurred, it is unacceptable to create an Infoset that cannot be processed in the normal way.

The XInclude specification itself highlights several situations that call out for special processing. We call on the specification not to satisfy itself with highlighting the problems, but to solve them. Among these are:

- namespace handling
- base handling
- name collisions on notations and entities
- PSVI properties

We find the statement that PSVI properties be carried across untouched particularly troubling: this decision makes it impossible to build reliable type-aware applications in an environment where XInclude processing may occur. We respectfully dissent and request that:

(a) the specification enumerate precisely which infoset properties are affected by the inclusion operation, and how they are affected;
(b) the specification require that infoset consistency be preserved;
(c) most particularly, that PSVI properties be either carried across so as to maintain consistency or not be carried across at all.

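The observation in (1) that a downstream application sees no marker of inclusion can be illustrated with a minimal sketch. It assumes Python's standard-library xml.etree.ElementInclude module as the XInclude processor, which is only one possible implementation and not one referenced by the Working Group; the document content, the file name chapter.xml, and the in-memory loader are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical host document containing a single inclusion.
HOST = """<doc xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="chapter.xml"/>
</doc>"""

# Hypothetical content of the included resource.
INCLUDED = "<chapter><title>One</title></chapter>"


def loader(href, parse, encoding=None):
    """In-memory stand-in for fetching the included resource (an assumption)."""
    if href == "chapter.xml" and parse == "xml":
        return ET.fromstring(INCLUDED)
    raise OSError(f"cannot load {href!r}")


root = ET.fromstring(HOST)
ElementInclude.include(root, loader=loader)

# After processing, the tree holds only the merged elements: nothing records
# that <chapter> arrived through an xi:include, or where it came from.
tags = [el.tag for el in root.iter()]
print(tags)  # ['doc', 'chapter', 'title']
```

With this processor, an application handed the resulting tree cannot distinguish the included chapter element from one written inline, which is the situation the comment describes.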
(2) We are deeply concerned that the XInclude specification interferes with meaningful type-aware processing. Some of the arguments are similar to those of http://tigger.uic.edu/~cmsmcq/tech/xml/munging.html. By raising an infrastructural process to the same architectural level as an application process, an ambiguity arises. Since it is not possible to know whether XInclude will be applied before or after validation, it becomes difficult to write schemas (and/or DTDs) that correctly describe instances that use XInclude, requiring the schema either to use an overabundance of disjunctions (xml:include | myElement) throughout or to "lie" at some point in the processing about the logical structure of the instances. Ubiquitous disjunctions are non-trivial to implement and may substantially harm the logical model of a schema. Some of our members have suggested that replacing the magic element with a magic type or a magic wildcard (any) would smooth the integration, but we have no consensus or concrete proposals at this time.

In general, there are architectural questions raised by the ambiguities inherent in combining, for example, XInclude with type-aware XPath. We believe these questions must be carefully considered and resolved. We recognize that resolving these questions should not fall on the XInclude specification alone: they are larger questions. We hope to work with the Core WG to help resolve these important architectural questions, which we believe must be resolved, and look forward to the Processing Model Workshop as a forum for progress on these issues.

(3) We consider it a mistake to erase all record that XInclude processing has occurred. This damages the usability of this specification for many applications, such as distributed editing, document packaging, and so on. Leaving a trace may well be part of a solution to (2) above.

We do not consider the fact that the current Infoset specification does not mandate properties recording a trace of external entities to be a reason for XInclude to do likewise, for two reasons: (1) some feel that that decision for Infoset was not a wise one, and (2) XInclude processing, unlike external entity resolution, is not guaranteed to occur before parsing and validation (and indeed that is the point of using an XML syntax for inclusion!). The preponderance of opinion in the Schemas WG was that this is a very important issue that must be addressed, although a minority felt it was less crucial.

(4) We wonder why the decision was made to specifically violate the RFCs for how fragment identifiers should be interpreted, in favour of a mandated interpretation. We do not consider it wise, in general, to run counter to the relevant IETF specifications. We do not see the rationale for forbidding, say, a schema-specific pointing syntax defined at the logical component model level from being used with XInclude to compose schema documents. We raise this as a general architectural question and ask for clarification of the rationale.

(5) The included XML Schema fragment does not quite capture the expressed constraints. We suggest that the attribute 'parse' should be defined with use='default' and value='xml' and that the anyAttribute be defined with namespace='##other'. Also, the DTD specifies that the include element must be empty, while the schema specifies that the include element can have character information item children.

We suggest the schema should be:

    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               xmlns:xi="http://www.w3.org/2001/XInclude"
               targetNamespace="http://www.w3.org/2001/XInclude">
      <xs:element name="include">
        <xs:complexType>
          <xs:attribute name="href" type="xs:anyURI" use="required" />
          <xs:attribute name="parse" use="optional" default="xml" >
            <xs:simpleType>
              <xs:restriction base="xs:string">
                <xs:enumeration value="xml"/>
                <xs:enumeration value="text"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:attribute>
          <xs:attribute name="encoding" use="optional" type="xs:string" />
          <xs:anyAttribute namespace="##other" />
        </xs:complexType>
      </xs:element>
    </xs:schema>

(6) We are doubtful whether it is appropriate to mandate normalized characters in all circumstances. We reiterate our comments on the Character Model for the Web:

"Early uniform normalization appears to have a laudable goal, but it is not clear that it is a reliable way, let alone the best way, to achieve that goal. It places a heavy burden on footprint-constrained software, and (as defined in this document) leaves downstream users more or less at the mercy of upstream software over which they have no control. We believe serious attention should be given to other normalization forms for Unicode (e.g. the decomposed normal form) and to other regimes for deciding who should normalize when."

We raise this as a general and important architectural question, and suggest that if the Character Model specification backs off from requiring early normalization, the XInclude specification do likewise.

Respectfully,
the XML Schema WG
Received on Tuesday, 19 June 2001 19:58:37 UTC