- From: John Boyer <jboyer@PureEdge.com>
- Date: Wed, 13 Sep 2000 09:33:19 -0700
- To: "Gregor Karlinger" <gregor.karlinger@iaik.at>, "XMLSigWG" <w3c-ietf-xmldsig@w3.org>
Hi Gregor, Thanks for writing. Please respond to this letter even if you agree with its content since Joseph has asked that we document positive acknowledgement of the resolution of issues raised by the WG (esp. if we intend the resolution to be not to change anything). First, note that there is no requirement to use XPath to build the required part of C14N. One's implementation must only behave in the same way, which includes replacing the CDATA sections with their character content. The section about text nodes that you cited [XPath, Section 5.7] mentions CDATA sections in the discussion of text nodes, but it is in order to say *exactly* the same thing we are saying: "Character data is grouped into text nodes. As much character data as possible is grouped into each text node: a text node never has an immediately following or preceding sibling that is a text node." "Each character within a CDATA section is treated as character data." I believe it is best to make it clear up front that CDATA sections will be replaced rather than burying it in text node processing because it is easier for people who don't implement with XPath to see what must be done to make a logically equivalent structure appropriate to their purposes. My opinion is that the discussion of the processing model should be based on the data model, unencumbered by details of how to create the data model. To me, this is esp. important given that some may choose to actually implement based on XPath (e.g. in order to perform document subsetting), and such people want to read the processing model to figure out how to process the data structure they have. As to your point about whether this concatenation is a common feature of XML processors, hopefully it is clear that it doesn't matter. Concatenation of successive blocks of character data is a trivial task. Implementers can either choose to do it in their data structure, or they can acknowledge that since they are consecutive in the input, simply outputing the text as encountered suffices. It is important to note that while many processors may report separate character blocks for CDATA sections, they are not required (and some do not) distinguish that the text came from a CDATA section because: "CDATA sections may occur anywhere character data may occur; they are used to escape blocks of text containing characters which would otherwise be recognized as markup." [XML, Section 2.7] (In other words, CDATA sections are a simple escaping mechanism). [XML] http://www.w3.org/TR/1998/REC-xml-1998021 [XPath] http://www.w3.org/TR/1999/REC-xpath-19991116 John Boyer Development Team Leader, Distributed Processing and XML PureEdge Solutions Inc. Creating Binding E-Commerce v: 250-479-8334, ext. 143 f: 250-479-3772 1-888-517-2675 http://www.PureEdge.com <http://www.pureedge.com/> -----Original Message----- From: w3c-ietf-xmldsig-request@w3.org [mailto:w3c-ietf-xmldsig-request@w3.org]On Behalf Of Gregor Karlinger Sent: Wednesday, September 13, 2000 5:58 AM To: XMLSigWG; John Boyer Subject: Canonical XML Comment (CDATA) Hi John, In section 2.1 there are listed the requirements for an XML processor to be used to create a node set: 3. replace CDATA sections with their character content I am not sure, but I don't think this is standard behaviour of the widespread XML parser implementations. Wouldn't it be better to skip this requirement and instead add a sencence to the explanation how to serialize Text nodes in section 2.2? The XPath data model also mentions CDATA sections in the explanation of the data model [1]. [1] http://www.w3.org/TR/xpath#section-Text-Nodes Regards, Gregor --------------------------------------------------------------- Gregor Karlinger mailto://gregor.karlinger@iaik.at http://www.iaik.at Phone +43 316 873 5541 Institute for Applied Information Processing and Communications Austria ---------------------------------------------------------------
Received on Wednesday, 13 September 2000 12:33:34 UTC