Fwd: Canonical XML error

Comment on Canonical XML from ietf list

regards, Frederick

Frederick Hirsch
Nokia



Begin forwarded message:

Resent-From: <w3c-ietf-xmldsig@w3.org>
From: ext Steve DeRose <steve.derose@openamplify.com>
Date: August 30, 2011 9:20:14 AM EDT
To: <jboyer@PureEdge.com>
Cc: <public-xmlsec@we.org>, <w3c-ietf-xmldsig@w3.org>
Subject: Canonical XML error

I recently discovered that the Canonical XML spec does not appear to specify which of several possible options to use to encode the literal string "]]>" in content. I have also checked the errata and cannot find it mentioned there.

This string marks the end of an XML CDATA marked section, so it must be escaped somehow when needed literally. It seems to me that the best choice, given other decisions in Canonical XML, is to express it as "]]&gt;". That is the method used in the source for the current edition of the XML Recommendation. But of course there are multiple alternatives, including at least:


    &#x5D;]>
    ]&#x5D;>
    ]]&#x3E;
    &#x5D;&#x5D;>
    &#x5D;]&#x3E;
    &#x5D;&#x5D;&#x3E;
    &#x5D;]&gt;
    &#x5D;&#x5D;&gt;


Clearly, if different users or applications encode the same intended content in different ways, that's a problem in the context of Canonical XML. Whether the string is common is irrelevant. Yet, there are contexts where this string naturally occurs: the most obvious are documents describing XML, and documents containing program code examples such as "a[b[0]]>1".
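
To make the ambiguity concrete, here is a minimal Python sketch, not taken from any spec text, that parses each of the variants listed above with the standard library's xml.etree.ElementTree and then re-escapes the decoded text. The names decoded_text and escape_text are hypothetical, and escape_text assumes the rule "always write '>' as '&gt;' in text content", which is the choice suggested above rather than a confirmed requirement.

```python
import xml.etree.ElementTree as ET

# The preferred form plus the alternatives listed above.
VARIANTS = [
    "]]&gt;",
    "&#x5D;]>",
    "]&#x5D;>",
    "]]&#x3E;",
    "&#x5D;&#x5D;>",
    "&#x5D;]&#x3E;",
    "&#x5D;&#x5D;&#x3E;",
    "&#x5D;]&gt;",
    "&#x5D;&#x5D;&gt;",
]


def decoded_text(fragment):
    """Wrap the fragment in a dummy element and return its parsed text."""
    return ET.fromstring("<x>" + fragment + "</x>").text


def escape_text(text):
    """Hypothetical canonical escaping: '&', '<' and '>' always become
    entity references, so "]]>" can only ever be written as "]]&gt;"."""
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
    )


# Every variant decodes to the same three characters ...
decoded = {decoded_text(v) for v in VARIANTS}

# ... so one fixed escaping rule collapses them to a single spelling.
normalized = {escape_text(decoded_text(v)) for v in VARIANTS}

print(decoded)
print(normalized)
```

Under that assumed rule, the program-code example mentioned above, "a[b[0]]>1", would be written as "a[b[0]]&gt;1" regardless of how the input document spelled it.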

Please specify a single encoding for this string in Canonical XML documents.

Steve DeRose
sderose@acm.org

Received on Tuesday, 6 September 2011 12:35:20 UTC