- From: Kay Michael <Michael.Kay@icl.com>
- Date: Mon, 24 Jan 2000 12:37:31 -0000
- To: "'www-xml-canonicalization-comments@w3.org'" <www-xml-canonicalization-comments@w3.org>
- Cc: "'www-xml-infoset-comments@w3.org'" <www-xml-infoset-comments@w3.org>
I am confused as to the relationship between Canonical XML and the Infoset. There seems to be a need for a stronger policy statement. A simplistic approach would say "if it's in the core Infoset, it's present in Canonical XML; if it isn't, it isn't". Instead we seem to have a pick-and-choose approach. For example, namespace prefixes are in the core Infoset but not in Canonical XML. Unparsed entities are also in the core Infoset but seem to be omitted from Canonical XML. This seems merely to perpetuate the tradition of each standard in the XML family making its own decisions about which aspects of an XML document are significant and which are not; the effect is to increase confusion about the true semantics of XML rather than to reduce it.

My own preferred approach would be to make the Infoset and Canonical XML a single document, with the latter describing a concrete algorithm for extracting the information items and properties described in the former.

That still leaves the problem that the model differs from the one used in XSLT and XPath. It seems very unfortunate that an XSLT processor that always generates Canonical XML, or one that canonicalizes its input before processing, will not conform to the XSLT standard because, for example, its handling of comments will not meet the XSLT specification. This also means that Canonical XML is not useful for conformance testing of XSLT processors.

The relationship with the XML Namespaces standard also needs to be spelt out more clearly. It's not clear whether the Canonical XML and Infoset standards are intended to apply to any XML document, or only to a document that also conforms to XML Namespaces.

A more detailed point (for Canonical XML): in 5.6 the condition "When the element type and the attribute names do not have namespaces" needs to be spelt out more pedantically: one could argue that every element type has a namespace.

Mike Kay
Received on Monday, 24 January 2000 07:39:45 UTC