Canonical XML error

I recently discovered that the Canonical XML spec does not appear to
specify which of several possible options to use to encode the literal
string "]]>" in content. I have also checked the errata and cannot find
this mentioned there.

This string marks the end of an XML CDATA marked section, so it must be
escaped somehow when needed literally. It seems to me that the best
choice, given other decisions in Canonical XML, is to express it as
"]]&gt;". That is the method used in the source for the current edition
of the XML Recommendation. But of course there are multiple
alternatives, including at least:


    ]]&gt;
    ]]&#62;
    ]]&#x3E;
    ]&#93;>
    ]&#x5D;>
    &#93;]>
    &#x5D;]>
    <![CDATA[]]]]><![CDATA[>]]>
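
To illustrate that such spellings are indistinguishable once parsed, here
is a minimal Python sketch using the standard library's
xml.etree.ElementTree. The VARIANTS list and the decoded_text helper are
illustrative names chosen for this example, not part of any spec, and the
list is only a sample of the possible encodings.

```python
import xml.etree.ElementTree as ET

# A sample of well-formed ways to write the literal string "]]>" in content.
# (Illustrative only; other spellings are possible.)
VARIANTS = [
    "]]&gt;",                        # predefined entity for ">"
    "]]&#62;",                       # decimal character reference for ">"
    "]]&#x3E;",                      # hexadecimal character reference for ">"
    "]&#93;>",                       # decimal character reference for "]"
    "&#x5D;]>",                      # hexadecimal character reference for "]"
    "<![CDATA[]]]]><![CDATA[>]]>",   # string split across two CDATA sections
]


def decoded_text(encoded: str) -> str:
    """Wrap an encoded fragment in a root element and return its parsed text."""
    return ET.fromstring(f"<doc>{encoded}</doc>").text


# Map each spelling to the character data a parser reports for it.
decoded = {variant: decoded_text(variant) for variant in VARIANTS}

for variant, text in decoded.items():
    print(f"{variant!r} parses to {text!r}")
```

Every spelling yields the same character data, so a canonical form has to
pick exactly one of them for two serializations to compare equal.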


Clearly, if different users or applications encode the same intended
content in different ways, that's a problem in the context of Canonical
XML. Whether the string is common is irrelevant. Yet, there are contexts
where this string naturally occurs: the most obvious are documents
describing XML, and documents containing program code examples such as
"a[b[0]]>1".

Please specify a single encoding for this string in Canonical XML
documents.
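
As a rough sketch of what one such rule could look like, the Python
function below always writes ">" as "&gt;", which is the choice suggested
earlier in this message. The name escape_text is hypothetical, and the
function covers only the three characters shown; it is an illustration
under that assumption, not the text of any specification.

```python
def escape_text(text: str) -> str:
    """Escape character data so that ">" is always written as "&gt;".

    Assumption for this sketch: only "&", "<" and ">" are handled.
    "&" must be replaced first so the entities added afterwards are
    not escaped a second time.
    """
    return (
        text.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
    )


# The code sample mentioned above, serialized with the single chosen form.
sample = escape_text("a[b[0]]>1")
print(sample)
```

With a fixed rule of this kind, the literal sequence "]]>" can never
appear in the output, and every producer emits the same bytes for the
same content.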

Steve DeRose
sderose@acm.org

Received on Monday, 5 September 2011 13:07:49 UTC