- From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
- Date: 08 Dec 2005 14:14:58 -0700
- To: W3C XML Schema Comments list <www-xml-schema-comments@w3.org>
- Cc: "Henry S. Thompson" <ht@cogsci.ed.ac.uk>, Michael Kay <mhk@mhk.me.uk>, Dave Remy <dremy@microsoft.com>, Ashok Malhotra <ashok.malhotra@oracle.com>
A long time ago [1], I took an action to draft a short note about how a schema processor could be made to handle attributes in the XML namespace without being non-conforming.

The following text describes the problem as it was reported to us, and the solutions we saw for it. If the WG is interested, we can turn this email into a WG Note and publish it formally; otherwise, it may suffice to archive this email on a public mailing list so that people can be referred to it for information on why the XML Schema WG does not plan to make the attributes of the XML namespace magical. For that reason, I am sending this note to the XML Schema comments list.

Some time has passed since the original problem report, and I have not heard any recent comments on the topic; I would be glad to know if the situation has changed since the problem was reported. Do the relevant frameworks now provide ways to work around the problem? Or is this note still relevant?

1 The problem

Conforming XInclude processors insert xml:base attributes at the root of included material; this causes the output to be labeled invalid if it is then validated against a schema which did not provide for xml:base attributes on those elements. Under these circumstances, XInclude and XML Schema 1.0 are hard to use together. What can be done to ease the pain?

2 Background

This issue was discussed by the XML Schema Working Group at some length [1], after initial reports of the issue in [2] and [3] and proposals by Henry Thompson [4] and Michael Kay [5]. (All of these links are member-only material; their technical content is summarized here for the benefit of those without access to W3C member-only material.)

In [2], Dave Remy of Microsoft suggested that the XML namespace ought to be treated the same way as the XSD 1.0 document-instance namespace (which I'll just call XSI). This would mean two things. First of all, attributes in the namespace would not need to be declared, but would be allowed on any element at all.
Second, they would be validated, wherever they appear, using attribute declarations built in to the schema processor. In the case of XSI, these properties are a consequence of clauses 3 and 4 of validation rule Element Locally Valid (Element) and related material. Similar clauses could be introduced for attributes in the XML namespace.

The suggestion in [2] was motivated by the unexpected rejection by schema validators of documents produced by XInclude processing. XInclude sets an xml:base attribute on the root of each inclusion, and this attribute will be rejected as undeclared if the schema does not declare it for the element in question and has no matching wildcard.

In [3], Ashok Malhotra of Oracle seconded Remy's suggestion and pointed to technical discussion in [6] (later moved to [7]), in which Daniel Cazzulino argues that unless something is done, XInclude and XML Schema will effectively be unusable together; Cazzulino suggests that the .NET XmlReaderSettings class be modified to allow a property requesting that attributes in the XML namespace be ignored for validation. (Note that this is not quite treating the XML and XSI namespaces in the same way, as suggested by Dave Remy.)

In [4], Henry Thompson outlines three possible approaches to this question. We could allow xml:* attributes anywhere by default, we could maintain the status quo, or we could add a mechanism to make it easier to declare attributes which should be allowed on all elements.

In [5], Michael Kay suggested that the third of Thompson's three approaches should be followed:

    I tend to think that the right answer is some kind of "global attribute use" - a declaration that any element in a particular namespace may carry certain attributes, identified either specifically by name or generically by namespace. The XML and XSI namespaces shouldn't be treated specially.

When the XML Schema Working Group discussed the question, members recalled that the XML spec explicitly suggests that in specific applications, attributes in the XML namespace should be controlled by declarations in a schema language: in the given context, the xml:lang attribute (for example) might be restricted to a small number of possible values, or have different default values. WG members found this too powerful a tool to give up, and so were not comfortable with treating the XML namespace as proposed in [2] or [3]. The proposal in [5] was attractive but not useful in the context of XML Schema 1.0, and difficult to reconcile with the compatibility policy in place for XML Schema 1.1.

3 Accommodating XML-namespace attributes in a conforming processor

There are several mechanisms that can be used to allow the successful validation of XInclude output using an XML Schema that has not been tailored for this situation. All can be implemented in conforming processors, without any change to either specification.

(1) An infoset-to-infoset transformation process can strip out the xml:base attributes and the [base uri] properties which depend on them.

This could be implemented and invoked as a post-processor called by an XInclude processor. (The output would be indistinguishable from that produced by an XInclude processor operating in a "no xml:base attributes" mode, but because the output is not the output of the XInclude process but of the post-processor, the XInclude processor need not be non-conforming as a result.)

It could also be implemented and invoked as a pre-processor called by an XSD 1.0 validator. Schema validity assessment is a process which accepts an information set as input; in the common case, the user wishes to validate the infoset produced by a conforming XML parser, but this situation is not required for conformance. The user can desire that validation be applied to some other infoset (e.g.
the same infoset after removal of all xml:base attributes) without requiring a non-conforming schema validator.

(2) An infoset-to-infoset transformation process can strip out the xml:base attributes while leaving the [base uri] properties set as they were in the input. Like solution (1), this could be implemented either as a post-processor for XInclude or as a pre-processor for an XML Schema validator.

(3) A schema-construction option can be provided which augments each complex type in the schema to allow an xml:base attribute (or, optionally, which augments only those types associated with elements which in fact have an xml:base attribute).

The XML Schema 1.0 spec is quite clear that schema components may be constructed or acquired by a schema validator in any way the implementors may think of and choose to implement. Constructing the components on the basis of schema documents is one very important way, and it is called out with its own level of conformance. But there is no rule in XML Schema 1.0 which forbids a processor from offering a different method of acquiring components. In this case, the processor would provide a method which involves (a) acquiring schema components by reading schema documents or in some other way, and then (b) augmenting, as needed, the complex types which cannot accept an xml:base attribute. Implementors will need to exercise care to ensure that extension and restriction relations between complex types are not rendered invalid by this augmentation.

Note that one problem identified in [6] and [7] is unaddressed by this mechanism: if elements declared with simple types appear as the root of included material and carry xml:base attributes, they will be invalid.
The schema construction method could be extended to derive complex types from those simple types and allow the xml:base attribute, but this might confuse downstream applications which expect the elements in question to be carrying xsd:integer or some other simple type in their {type definition} property.

New mechanisms in XML Schema to make it easier to declare attributes as legal on any element at all (as proposed in [5]) may be desirable in the long run, but they are not essential in the short run. Any of these three approaches can be implemented by a conforming XML Schema 1.0 processor, and some of them also by a conforming XInclude processor, or by an XML processing environment.

[1] http://www.w3.org/2005/04/15-xmlschema-minutes.html#item05
[2] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2005Jan/0024.html
[3] http://lists.w3.org/Archives/Member/w3c-xml-schema-wg/2005Mar/0030.html
[4] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2005Jan/0026.html
[5] http://lists.w3.org/Archives/Member/w3c-xml-schema-wg/2005Apr/0025.html
[6] http://weblogs.asp.net/cazzu/archive/2005/01/10/XsdAndXInclude.aspx
[7] http://clariusconsulting.net/blogs/kzu/archive/2005/01/10/XsdAndXInclude.aspx
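
As a rough illustration of the stripping step described under mechanisms (1) and (2) in section 3, the following Python sketch removes every xml:base attribute from a parsed document before it would be handed to a validator. It is only a sketch under stated assumptions: it uses the standard library's xml.etree.ElementTree, which works on an element tree rather than a full infoset and does not model the [base uri] property, so the difference between mechanisms (1) and (2) is not visible here. The function name strip_xml_base, the SAMPLE document, and the example.org URI are hypothetical, chosen for illustration.

```python
import xml.etree.ElementTree as ET

# The namespace bound to the reserved "xml" prefix.
XML_NS = "http://www.w3.org/XML/1998/namespace"
# ElementTree stores qualified attribute names in {namespace}local form.
XML_BASE = f"{{{XML_NS}}}base"


def strip_xml_base(root):
    """Remove every xml:base attribute in place; return how many were removed.

    Other attributes in the XML namespace (xml:lang, xml:space, ...) are
    left alone, so a schema can still constrain them.
    """
    removed = 0
    for elem in root.iter():  # iter() visits root and all descendants
        if XML_BASE in elem.attrib:
            del elem.attrib[XML_BASE]
            removed += 1
    return removed


# Hypothetical XInclude output: the included <chapter> carries the
# xml:base attribute added by the XInclude processor.
SAMPLE = (
    "<doc>"
    '<chapter xml:base="http://example.org/ch1.xml" xml:lang="en">'
    "<title>One</title>"
    "</chapter>"
    "</doc>"
)

root = ET.fromstring(SAMPLE)
removed = strip_xml_base(root)
cleaned = ET.tostring(root, encoding="unicode")
```

The cleaned tree (or its serialization in cleaned) is what would then be passed to the schema validator. Whether this runs as a post-processor after XInclude or as a pre-processor in front of the validator is a packaging decision; neither component needs to change.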
Received on Thursday, 8 December 2005 21:15:15 UTC