
Canonical XML error

From: Steve DeRose <steve.derose@openamplify.com>
Date: Tue, 30 Aug 2011 09:20:14 -0400
To: jboyer@PureEdge.com
Cc: public-xmlsec@we.org, w3c-ietf-xmldsig@w3.org
Message-ID: <1314710414.10641.25.camel@sderose-ThinkPad-T400>

I recently discovered that the Canonical XML spec does not appear to
specify which of several possible encodings to use for the literal
string "]]>" in content. I have also checked the errata, and cannot find
this mentioned there.

This string marks the end of an XML CDATA marked section, so it must be
escaped somehow when needed literally in content. It seems to me that the
best choice, given other decisions in Canonical XML, is to express it as
"]]&gt;". That is the method used in the source for the current edition
of the XML Recommendation. But of course there are multiple
alternatives, including at least:


    &#x5D;]>
    ]&#x5D;>
    ]]&#x3E;
    &#x5D;&#x5D;>
    &#x5D;]&#x3E;
    &#x5D;&#x5D;&#x3E;
    &#x5D;]&gt;
    &#x5D;&#x5D;&gt;
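
To make the equivalence concrete, here is a minimal sketch (assuming Python's standard xml.etree.ElementTree parser; the wrapper element name "x" and the helper decoded_text are hypothetical, chosen only for illustration). It parses the proposed form and each alternative listed above and collects the resulting character data:

```python
import xml.etree.ElementTree as ET

# The proposed form "]]&gt;" plus the eight alternatives listed above.
ENCODINGS = [
    "]]&gt;",
    "&#x5D;]>",
    "]&#x5D;>",
    "]]&#x3E;",
    "&#x5D;&#x5D;>",
    "&#x5D;]&#x3E;",
    "&#x5D;&#x5D;&#x3E;",
    "&#x5D;]&gt;",
    "&#x5D;&#x5D;&gt;",
]


def decoded_text(encoded: str) -> str:
    """Hypothetical helper: parse <x>encoded</x> and return its character data."""
    return ET.fromstring(f"<x>{encoded}</x>").text


# Collect the distinct decoded strings; nine spellings should collapse to one.
decoded = {decoded_text(e) for e in ENCODINGS}
print(decoded)
```

All nine spellings are distinct byte sequences that a parser reads as the same three characters, which is why a canonical form needs to pick exactly one of them.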


Clearly, if different users or applications encode the same intended
content in different ways, that's a problem in the context of Canonical
XML. Whether the string is common is irrelevant. Yet, there are contexts
where this string naturally occurs: the most obvious are documents
describing XML, and documents containing program code examples such as
"a[b[0]]>1".
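
As a rough illustration of the program-code case, the sketch below (again assuming Python's xml.etree.ElementTree; escape_text and the element name "code" are hypothetical names, and the escaping rule shown is the "]]&gt;" choice suggested in this message, not something taken from the spec) checks that the literal string is rejected in content and that the escaped form round-trips:

```python
import xml.etree.ElementTree as ET


def escape_text(text: str) -> str:
    """Hypothetical helper: escape character data, always writing '>' as '&gt;'."""
    # '&' must be replaced first so the entities added below are not re-escaped.
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


sample = "a[b[0]]>1"

# A literal "]]>" in element content is not well-formed XML.
try:
    ET.fromstring(f"<code>{sample}</code>")
    literal_ok = True
except ET.ParseError:
    literal_ok = False

# The escaped form parses and decodes back to the original string.
escaped = escape_text(sample)
round_trip = ET.fromstring(f"<code>{escaped}</code>").text
print(literal_ok, escaped, round_trip == sample)
```

Escaping every ">" unconditionally avoids having to detect the "]]>" sequence at all, at the cost of a few extra characters in the output.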

Please specify a single encoding for this string in Canonical XML
documents.

Steve DeRose
sderose@acm.org
Received on Monday, 5 September 2011 13:07:49 GMT
