RE: Canonical XML Comment (CDATA) from John Boyer on 2000-09-13 (w3c-ietf-xmldsig@w3.org from July to September 2000)

From: John Boyer <jboyer@PureEdge.com>
Date: Wed, 13 Sep 2000 12:43:04 -0700
To: "Gregor Karlinger" <gregor.karlinger@iaik.at>, "XMLSigWG" <w3c-ietf-xmldsig@w3.org>
Message-ID: <BFEDKCINEPLBDLODCODKIEBFCFAA.jboyer@PureEdge.com>
Hi Gregor,

Yes, we also felt it necessary to make this clear, but we felt the need to
make it clear right up front.  Rather than tweaking the fourth paragraph of
2.1, we devote the entire first paragraph of 2.1 to making this clear:

"The data model defined in the XPath 1.0 Recommendation [XPath] is used to
represent the input XML document or document subset. Implementations SHOULD
but need not be based on an XPath implementation. XML canonicalization is
defined in terms of the XPath definition of a node-set, and implementations
MUST produce equivalent results."

Does this suffice?

Thanks,
John Boyer
Development Team Leader,
Distributed Processing and XML
PureEdge Solutions Inc.
Creating Binding E-Commerce
v: 250-479-8334, ext. 143  f: 250-479-3772
1-888-517-2675   http://www.PureEdge.com <http://www.pureedge.com/>



-----Original Message-----
From: Gregor Karlinger [mailto:gregor.karlinger@iaik.at]
Sent: Wednesday, September 13, 2000 9:57 AM
To: John Boyer; XMLSigWG
Subject: AW: Canonical XML Comment (CDATA)


Hi John,

I agree with you, but maybe you could tweak the fourth paragraph of
section 2.1, so that it gets clear that the XML parser described there
is only a hypothetical model.

Regards, Gregor
---------------------------------------------------------------
Gregor Karlinger
mailto://gregor.karlinger@iaik.at
http://www.iaik.at
Phone +43 316 873 5541
Institute for Applied Information Processing and Communications
Austria
---------------------------------------------------------------



> -----Ursprüngliche Nachricht-----
> Von: John Boyer [mailto:jboyer@PureEdge.com]
> Gesendet: Mittwoch, 13. September 2000 18:33
> An: Gregor Karlinger; XMLSigWG
> Betreff: RE: Canonical XML Comment (CDATA)
>
>
> Hi Gregor,
>
> Thanks for writing.  Please respond to this letter even if you agree with
> its content since Joseph has asked that we document positive
> acknowledgement
> of the resolution of issues raised by the WG (esp. if we intend the
> resolution to be not to change anything).
>
> First, note that there is no requirement to use XPath to build
> the required
> part of C14N.  One's implementation must only behave in the same
> way, which
> includes replacing the CDATA sections with their character content.  The
> section about text nodes that you cited [XPath, Section 5.7]
> mentions CDATA
> sections in the discussion of text nodes, but it is in order to say
> *exactly* the same thing we are saying:
>
> "Character data is grouped into text nodes. As much character data as
> possible is grouped into each text node: a text node never has an
> immediately following or preceding sibling that is a text node."
>
> "Each character within a CDATA section is treated as character data."
>
> I believe it is best to make it clear up front that CDATA sections will be
> replaced rather than burying it in text node processing because
> it is easier
> for people who don't implement with XPath to see what must be
> done to make a
> logically equivalent structure appropriate to their purposes.
>
> My opinion is that the discussion of the processing model should
> be based on
> the data model, unencumbered by details of how to create the data
> model.  To
> me, this is esp. important given that some may choose to actually
> implement
> based on XPath (e.g. in order to perform document subsetting), and such
> people want to read the processing model to figure out how to process the
> data structure they have.
>
> As to your point about whether this concatenation is a common
> feature of XML
> processors, hopefully it is clear that it doesn't matter.
> Concatenation of
> successive blocks of character data is a trivial task.  Implementers can
> either choose to do it in their data structure, or they can
> acknowledge that
> since they are consecutive in the input, simply outputing the text as
> encountered suffices.
>
> It is important to note that while many processors may report separate
> character blocks for CDATA sections, they are not required (and
> some do not)
> distinguish that the text came from a CDATA section because:
>
> "CDATA sections may occur anywhere character data may occur; they are used
> to escape blocks of text containing characters which would otherwise be
> recognized as markup."
> [XML, Section 2.7]
>
> (In other words, CDATA sections are a simple escaping mechanism).
>
> [XML] http://www.w3.org/TR/1998/REC-xml-1998021
> [XPath] http://www.w3.org/TR/1999/REC-xpath-19991116
>
> John Boyer
> Development Team Leader,
> Distributed Processing and XML
> PureEdge Solutions Inc.
> Creating Binding E-Commerce
> v: 250-479-8334, ext. 143  f: 250-479-3772
> 1-888-517-2675   http://www.PureEdge.com <http://www.pureedge.com/>
>
>
>
> -----Original Message-----
> From: w3c-ietf-xmldsig-request@w3.org
> [mailto:w3c-ietf-xmldsig-request@w3.org]On Behalf Of Gregor Karlinger
> Sent: Wednesday, September 13, 2000 5:58 AM
> To: XMLSigWG; John Boyer
> Subject: Canonical XML Comment (CDATA)
>
>
> Hi John,
>
> In section 2.1 there are listed the requirements for an XML
> processor to be used to create a node set:
>
>  3. replace CDATA sections with their character content
>
> I am not sure, but I don't think this is standard behaviour of
> the widespread XML parser implementations.
>
> Wouldn't it be better to skip this requirement and instead
> add a sencence to the explanation how to serialize Text nodes
> in section 2.2?
>
> The XPath data model also mentions CDATA sections in the
> explanation of the data model [1].
>
> [1] http://www.w3.org/TR/xpath#section-Text-Nodes
>
> Regards, Gregor
> ---------------------------------------------------------------
> Gregor Karlinger
> mailto://gregor.karlinger@iaik.at
> http://www.iaik.at
> Phone +43 316 873 5541
> Institute for Applied Information Processing and Communications
> Austria
> ---------------------------------------------------------------
>
>
>
>
Received on Wednesday, 13 September 2000 15:43:08 UTC