- From: Mary Holstege <holstege@mathling.com>
- Date: Tue, 19 Jun 2001 16:59:47 -0700
- To: www-xml-xinclude-comments@w3.org, w3c-xml-core-wg@w3.org
XML Schema Working Group comments on XML Inclusions Last Call Working Draft
We believe that the XInclude specification defines a foundation
specification that has to be harmonized carefully with the other
foundation specifications. The following points outline our concerns
with the specification as it stands:
(1) While the Infoset specification countenances synthetic infosets
that do not maintain the normal consistency relations of infosets
created directly by parsing XML, we consider it a poor idea in
general to take advantage of that laxity, particularly in the case
of such a foundational specification.
We believe the XInclude specification must be crystal clear which
Infoset properties are adjusted, and how, and further that it
should specify rules so that core invariants are maintained. Since
a downstream application has no markers in an Infoset that the
XInclude process has occurred, it is unacceptable to create an
Infoset that cannot be processed in the normal way. The XInclude
specification itself highlights several situations that call out
for special processing. We call on the specification not to
satisfy itself with highlighting the problems, but to solve
them. Among these are:
namespace handling
base handling
name collisions on notations and entities
PSVI properties
We find the statement that PSVI properties be carried across
untouched particularly troubling: this decision makes it
impossible to build reliable type-aware applications in an
environment where XInclude processing may occur.
We respectfully dissent and request that:
(a) the specification enumerate precisely which infoset properties are
affected by the inclusion operation, and how they are affected;
(b) the specification require that infoset consistency be preserved;
(c) most particularly that PSVI properties be either carried across
so as to maintain consistency or not be carried across at all.
(2) We are deeply concerned that the XInclude specification interferes
with meaningful type-aware processing. Some of the arguments are
similar to those of
http://tigger.uic.edu/~cmsmcq/tech/xml/munging.html. By raising
an infrastructural process to the same architectural level as an
application process, an ambiguity arises. Since it is not possible
to know whether XInclude will be applied before or after validation
it becomes difficult to write schemas (and/or DTDs) that correctly
describe instances that use XInclude, requiring the schema to either
use an overabundance of disjunctions (xml:include | myElement) throughout
the schema or "lie" at some point in the processing about the logical
structure of the instances. Ubiquitous disjunctions are non-trivial
to implement and may substantially harm the logical model of a schema.
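To make the ordering ambiguity concrete, here is a minimal sketch in Python. It uses the standard-library xml.etree.ElementInclude module purely as a stand-in XInclude processor; the toy content-model check, the element names (doc, title, body) and the in-memory loader are hypothetical and are not taken from the draft.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

XI = "http://www.w3.org/2001/XInclude"

# Hypothetical instance and in-memory resource, for illustration only.
PARENT = f'<doc xmlns:xi="{XI}"><title>T</title><xi:include href="body.xml"/></doc>'
FILES = {"body.xml": "<body>text</body>"}

def loader(href, parse, encoding=None):
    # Stand-in for fetching the resource; returns a parsed element.
    return ET.fromstring(FILES[href])

def matches_model(root, expected_children):
    # Toy "validation": the children must match the expected sequence exactly.
    return [child.tag for child in root] == expected_children

logical_model = ["title", "body"]                   # describes the expanded form
unexpanded_model = ["title", f"{{{XI}}}include"]    # describes the form as written

root = ET.fromstring(PARENT)
before = (matches_model(root, logical_model), matches_model(root, unexpanded_model))
ElementInclude.include(root, loader=loader)
after = (matches_model(root, logical_model), matches_model(root, unexpanded_model))
print(before, after)
```

Neither model accepts the instance at both points in the pipeline, which is why a schema author who cannot know the processing order is pushed toward a disjunction at every place an inclusion might appear.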
Some of our members have suggested that replacing the magic
element with a magic type or a magic wildcard (any) would smooth
the integration, but we have no consensus or concrete proposals at
this time.
In general, there are architectural questions raised by the
ambiguities inherent in combining, for example, XInclude with
type-aware XPath. We believe these questions must be carefully
considered and resolved. We recognize that resolving these
questions should not fall solely on the XInclude specification:
they are larger questions.
We hope to work with the Core WG to help resolve these important
architectural questions, which we believe must be resolved, and look
forward to the Processing Model Workshop as a forum for progress on these
issues.
(3) We consider it a mistake to erase all record that XInclude
processing has occurred. This damages the usability of this
specification for many applications, such as distributed editing,
document packaging, and so on. Leaving a trace may well be part of a
solution to (2) above. We do not find the fact that the current
Infoset specification does not mandate properties recording a trace of
external entities a reason for XInclude to do likewise for two reasons:
(1) some feel that that decision for Infoset was not a wise one, and
(2) XInclude processing, unlike external entity resolution, is not
guaranteed to occur before parsing and validation (and indeed that is
the point of using an XML syntax for inclusion!). The preponderance of
the opinion in the Schemas WG was that this is a very important issue that
must be addressed, although a minority felt it was less crucial.
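The loss of trace can be observed with any processor that replaces the include element outright. The sketch below again assumes Python's standard-library xml.etree.ElementInclude as a stand-in; its behaviour is not normative for the draft, and the document, the href value and the loader are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

XI = "http://www.w3.org/2001/XInclude"

# Hypothetical parent document and in-memory resource.
PARENT = f'<book xmlns:xi="{XI}"><xi:include href="ch1.xml"/></book>'
FILES = {"ch1.xml": "<chapter>One</chapter>"}

def loader(href, parse, encoding=None):
    # Stand-in for resource retrieval.
    return ET.fromstring(FILES[href])

root = ET.fromstring(PARENT)
ElementInclude.include(root, loader=loader)

# After processing, neither the include element nor the source URI survives.
serialized = ET.tostring(root, encoding="unicode")
include_left = root.find(f"{{{XI}}}include")
print(serialized)
```

A downstream editor or packaging tool that receives only the result has no way to tell which parts came from ch1.xml, which is the usability concern raised above.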
(4) We wonder why the decision was made to specifically violate the
RFCs for how fragment identifiers should be interpreted, in favour
of a mandated interpretation. We do not consider it wise, in
general, to run counter to the relevant IETF specifications. We do
not see the rationale of forbidding, say, a schema-specific
pointing syntax defined at the logical component model level being
used with XInclude to compose schema documents. We raise this as a
general architectural question and ask for clarification of the rationale.
(5) The included XML Schema fragment does not quite capture the
expressed constraints. We suggest that the attribute 'parse'
should be defined with use='optional' and default='xml' and that the
anyAttribute be defined with namespace='##other'. Also the DTD
specifies that the include element must be empty while the schema
specifies that the include element can have character information
item children.
We suggest the schema should be:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:xi="http://www.w3.org/2001/XInclude"
           targetNamespace="http://www.w3.org/2001/XInclude">
  <xs:element name="include">
    <xs:complexType>
      <xs:attribute name="href" type="xs:anyURI" use="required"/>
      <xs:attribute name="parse" use="optional" default="xml">
        <xs:simpleType>
          <xs:restriction base="xs:string">
            <xs:enumeration value="xml"/>
            <xs:enumeration value="text"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
      <xs:attribute name="encoding" use="optional" type="xs:string"/>
      <xs:anyAttribute namespace="##other"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
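The constraints this schema is meant to express can be spelled out by hand. The following Python sketch is not a schema validator; it is a hypothetical helper (check_include is an invented name) that tests one element against the four rules above: href required, parse limited to 'xml' or 'text' with 'xml' as the default, extra attributes only from other namespaces, and empty content.

```python
import xml.etree.ElementTree as ET

XI = "http://www.w3.org/2001/XInclude"
DECLARED = {"href", "parse", "encoding"}

def check_include(elem):
    """Return a list of problems found on one xi:include element."""
    if elem.tag != f"{{{XI}}}include":
        return ["not an xi:include element"]
    problems = []
    if "href" not in elem.attrib:
        problems.append("href is required")
    # An absent parse attribute takes the default value 'xml'.
    if elem.get("parse", "xml") not in ("xml", "text"):
        problems.append("parse must be 'xml' or 'text'")
    for name in elem.attrib:
        if name in DECLARED:
            continue
        # namespace="##other": qualified, and not in the XInclude namespace.
        if not name.startswith("{") or name.startswith(f"{{{XI}}}"):
            problems.append(f"attribute {name} not allowed")
    # Empty content: no element children and no character data.
    if len(elem) or elem.text:
        problems.append("content must be empty")
    return problems

ok = ET.fromstring(f'<xi:include xmlns:xi="{XI}" href="a.xml"/>')
bad = ET.fromstring(f'<xi:include xmlns:xi="{XI}" parse="html" foo="1">x</xi:include>')
print(check_include(ok))
print(check_include(bad))
```

The second element is rejected on all four counts, including the character content that the DTD forbids but the schema fragment in the draft would allow.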
(6) We are doubtful whether it is appropriate to mandate normalized
characters in all circumstances. We reiterate our comments on the
Character Model for the Web:
"Early uniform normalization appears to have a laudable goal, but
it is not clear that it is a reliable way, let alone the best way,
to achieve that goal. It places a heavy burden on
footprint-constrained software, and (as defined in this document)
leaves downstream users more or less at the mercy of upstream
software over which they have no control. We believe serious
attention should be given to other normalization forms for Unicode
(e.g. the decomposed normal form) and to other regimes for
deciding who should normalize when."
We raise this as a general important architectural question, and suggest
that if the Character Model specification backs off from requiring
early normalization, the XInclude specification do likewise.
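For readers unfamiliar with the distinction between the normalization forms mentioned in the quotation, a short Python illustration follows. It uses the standard-library unicodedata module; the sample character is an arbitrary choice and the form names NFC (composed) and NFD (decomposed) are those of the Unicode standard.

```python
import unicodedata

composed = "\u00e9"       # e-acute as a single code point
decomposed = "e\u0301"    # letter e followed by a combining acute accent

# The two spellings render alike but compare unequal as code point sequences.
same_before = composed == decomposed

nfc = unicodedata.normalize("NFC", decomposed)   # composed normal form
nfd = unicodedata.normalize("NFD", composed)     # decomposed normal form
print(same_before, len(nfc), len(nfd))
```

Whichever form is mandated, some party in the chain has to perform this conversion, and the question raised here is who that should be and when.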
Respectfully,
the XML Schema WG
Received on Tuesday, 19 June 2001 19:58:37 UTC